
US005918232A 

United States Patent [19] [ii] Patent Number: 5,918,232 

Pouschine et al. [45] Date of Patent: Jun. 29, 1999 



[54] MULTIDIMENSIONAL DOMAIN MODELING 
METHOD AND SYSTEM 

[75] Inventors: Nicholas Pouschine, Fremont; Kenner 
G. Stross, Oakland; Michael L. Brill, 

San Francisco, all of Calif. 

[73] Assignee: Whitelight Systems, Inc., Palo Alto, 
Calif. 



[21] 


Appl. No.: 08/978,168 




[22] 


Filed: Nov. 26, 1997 




[51] 


Int. CI. 6 


G06F 17/30 


[52] 


U.S. CI 


707/103; 707/2; 707/3; 






707/4 


[58] 


Field of Search 


707/103, 2, 3, 






707/4 



[56] References Cited 

U.S. PATENT DOCUMENTS 

5,367,619 11/1994 Dipaolo et al 395/149 

5,495,608 2/1996 Antoshenkov . 

5,560,007 9/1996 Thai. 

5,592,666 1/1997 Perez . 

5,664,172 9/1997 Antoshenkov. 

5,666,528 9/1997 Thai. 

5,742,738 4/1998 Koza et al 395/13 

Primary Examiner — Wayne Amsbury 

Assistant Examiner— Cheryl R. Lewis 

Attorney, Agent, or Firm— Larry B. Guernsey; Michael J. 

Hughes 



[57] ABSTRACT 

A system and method for computer modeling (10) and for 
creating hyperstructures (51) which are to be contained in a 
computer memory, which obtains measurements of physical 
objects and activities which arc related to the entity to be 
modeled in the computer hyperstructure (51). The measure- 
ments are transformed into computer data which corre- 
sponds to the physical objects and activities external to the 
computer system (10). A plurality of independent dimen- 
sions (54) are created, where each dimension (54) includes 
at least one element (58). A plurality of cells (56) arc created, 
each of which is associated with the intersection of two or 
more elements (58), each cell (56) being capable of storing 
at least one value. At least one rule domain (60) is associated 
with at least one cell (56), the rule domain (60) including at 
least one rule for assigning values to the associated cells 
(56). A domain modeling rule set (126) is prepared (300), 
which determines which of the rules will provide the value 
associated with each of the cells (56) wherein application of 
the domain modeling rule set (126) to the hyperstructure 
(51) causes a physical transformation of the data corre- 
sponding to said physical objects which are modeled in said 
hyperstructure (51). 

Also disclosed is a method for querying computer hyper- 
structures (51), a Hyperstructure Query Language, and a 
"cell explorer", which allows direct viewing of the applied 
formulas that produce a specific value for a cell (56). 
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MULTIDIMENSIONAL DOMAIN MODELING 
METHOD AND SYSTEM 

TECHNICAL FIELD 

The present invention relates generally to database 
manipulation techniques, and more particularly to analytical 
processing methods. The inventors anticipate that primary 
application of the present invention will be for modeling of 
various aspects of businesses as an aid in decision making 
and for long-term strategy formulation. 

BACKGROUND ART 

The modem world can be characterized in terms of 
ever-increasing complexity. In the business world in 
particular, most companies face intense competition, and 
constantly changing consumer preferences. These compa- 
nies must deal with the impact of new laws and regulations 
in some areas while experiencing uncertainty from the 
effects of deregulation in other areas. A business can expect 
competition not only from other local businesses but also 
from national and international companies as well. The 
broad scope of worldwide competition further increases the 
number of variables to be considered. In an effort to deal 
more effectively with this volatile environment, many busi- 
nesses have turned to computer modeling to more accurately 
calculate the effect of changing variables on business per- 
formance. 

Computer modeling allows great quantities of data to be 
utilized in the construction of data models. A data model is 
a collection of conceptual tools for describing real-world 
processes to be modeled based on data from the database, 
and the relationships among these processes. There are 
several variations of the database organization methods that 
are used in the construction of data models. A relational 
database uses a method in which files are linked together as 
required. In a non-relational system, such as hierarchical or 
network type, records in one file contain embedded pointers 
to the locations of records in another. These are fixed links 
which are set up in advance to speed up processing. In a 
relational database, relationships between files are created 
by comparing data, such as accounts and names. A relational 
system is more flexible, and can take two or more files and 
generate a new file that meets matching criteria, such as the 
names of customers who bought a certain product. 

In order to use relational databases, a Structured Query 
Language (SQL) was developed to interrogate and process 
data. SQL can be used as a fill-blown programming 
language, or SQL commands can be embedded within a 
programming language to interface with a database. 

The DataBase Management System (DBMS) is the soft- 
ware that manages the database. It allows for creation of 
tables and the processing done in relation to these tables. 
Most all DBMSs today are actually Relational DBMSs 
(RDBMS), which manage the relationships between tables. 
The DBMS creates tables by defining individual elements of 
data that fill them. These data fields are called "attributes" in 
a relational database. When data is entered, updated, 
queried, or when reports are generated, the application 
communicates with the DBMS in SQL. 

A Decision Support System (DSS) is an information and 
planning system that allows a user to analyze information 
and predict the impact of decisions before they are imple- 
mented. A DSS is an integrated set of programs that share 
data. These programs might also retrieve from external 
sources industry data that can be compared and used for 
statistical purposes. 
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Large companies may have millions of customers and 
thousands of products. When a trend in the marketing of a 
certain product or product line is to be tracked, multiple 
years worth of data may be required, resulting in a huge 

5 amount of data to be manipulated. Managers and executives 
may want views of several aspects of the data, such as 
summaries by region, product type and year. This kind of 
search is not easily bandied by a relational database. 
In response to business needs of this sort, a database 

30 architecture called the On Line Analytical Processing data- 
base (OLAP) has been developed. This data model is 
conceptualized as a multidimensional cube or block com- 
posed of smaller "cells". A simple three-dimensional model 
can be visualized as a cube in which data resides at the 

35 intersections of elements which comprise the dimensions. A 
simple example would have divisions of a company "A", 
"B", and "C" along an "X" axis dimension, time in years, 
such as "1996", "1997", and "1998" along a "Y" axis 
dimension, and a financial dimension including the "Sales", 

20 "Units", and "Profits" along a "Z" axis. The individual 
components of the dimensions are called "elements", and in 
this example include A, B and C from the "company" 
dimension, Sales, Units and Profits from the "financial" 
dimension, etc. Once this model is constructed, a user can 

25 specify data to be extracted from a particular cell of the 
model, for example the projected profits expected from B 
company in the year 1998. The contents of the cells may be 
accessed from database records for known quantities or they 
may be calculated by applying known business or math- 

30 ematical principles to data from the database. 

Within the larger classification of OLAP there are two 
main variations, MOLAP and ROLAP. 
Multidimensional OLAP (MOLAP) allows the user to 

35 load data into a proprietary format database and precalculate 
all the cells in a cube. Users can then enter queries against 
that cube. Since the entire cube is calculated in advance, 
response to queries can be very fast. However, loading and 
precalculating the entire cube can take many hours or days, 

4Q and changes to the data or model are not easily implemented. 
MOLAP may result in the unnecessary expenditure of effort 
and expense, since all cells are precomputed and stored, 
regardless of whether anyone ever requires their data. 
Because of the necessity to construct each cell, MOLAP is 

45 limited to relatively small datasets, and generally MOLAPs 
are poorly integrated with underlying relational databases. 
MOLAPs also have the disadvantage of being complex to 
administer. Additionally, the MOLAP cubes cannot share 
data, or metadata (data that describes the included data) with 

5Q other cubes. Introduction of hypothetical information to 
produce "what-if ' type analysis is allowed on only a limited 
basis. 

MOLAP is especially used in budgeting and planning 
applications. Examples of MOLAP software are Oracle 

55 Express™ and Arbor Essbase™. 

Relational OLAP (ROLAP) allows the data to stay in a 
data warehouse until a user submits a query. The ROLAP 
server receives the query, and translates it into SQL queries 
that are sent to the data warehouse. The server then receives 

60 the data, performs further calculations if required and returns 
the results to the user. In contrast to MOLAP, cells in the 
ROLAP cube are calculated as required "on the fly" in a 
virtual cube, and the cells are discarded after the query is 
complete. 

65 Traditional ROLAP also has its own disadvantages. Busi- 
ness modeling is generally very limited, and the database 
typically needs much tuning and administrative attention. 
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Performance may be marginal, and processing of hypolheti- Still another object of the invention is to provide a 

cal information for "what-if" analysis is usually not pos- computer modeling system and method which is less com- 

sible. Also ROLAP products generally support only a small plicated than the prior art, and which non-programmers may 

number of interactive users. use t0 build and analyze models. 

ROLAP is especially used in sales and marketing analy- 5 A object of the inveDtion k to provide a computer 

m \ EX r T £ leS ,°, f f R0L ^ A°i lWa , re a xM MlcroStrate 8y modeling system and method which allows easy modeling 

Metacube™ and ^ of hypotheiical data for easy "what if ' analysis 

In both types of traditional OLAP, there is a database . - . ■ ■ - . . 

infrastructure, which is usually developed and maintained A * et farthcr ob J ect of the inventl0n 15 10 P^e a 

by an Information Technology (IT) department. Develop- 10 computer modeling system and method which allows the 

ment of OLAP models usually follows a cycle of model user t0 easil y review the formulas and rules which are the 

building, analysis based on the model, validation of the Dasis for calculation of model values, 

analysis and publication of results, which leads to further A still further object of the invention is to provide a 

refinements and additions to the model, and so on. In computer modeling system and method that can support a 

traditional OLAP, the IT department is usually involved with large number of users in a distributive manner, 

construction of the models, validation and publication of 3 M additional object of the invention is to provide a 

results, while the end user is involved only with the analysis ter modeli tem and method which 0Q the Qne 

phase of the cycle. This procedure is something of a neces- han(J aU()WS more ^ access model oonslruclio|1 and 

sity for most OLAP programs, since they are largely . .. . * orovides re cmiation of such 

designed to be implemented by personnel with at least some Ub6 ' . WtUCb , P ro ™. es SBa f G regulation of such 

knowledge of computer programming, such as a company's 20 «*ess. * a computer modeling system having at least one 

IT department. This leads to a certain isolation of the end da ' a stora S e mean \ a data Processing means, mput means, 

user (who may not be deeply computer literate) from the and out P ut means > lhe sic ^ comprising: 

models which he may be manipulating. The end user may Briefly, one preferred embodiment of the present inven- 

not be aware of certain unshared assumptions built into the lion is a method for creating multidimensional data models 

model. Additionally, the development cycle may be con- 25 or hyperstructures, which are to be contained in a computer 

stantly interrupted, sometimes for weeks or months at a memory. Measurements of physical objects and activities 

time, because the user must involve the IT department for which are related to the entity to be modeled in the computer 

even minor modifications of the models, and then for hyperstructure are obtained, and these measurements are 

publication of the results. transformed into computer data which corresponds to the 

A better type of OLAP would provide intuitive user 30 physical objects and activities external to the computer 

interfaces, which non-programmers could easily utilize to system. A plurality of independent dimensions is constructed 

build and modify models, and to publish results. This would within a hyperstructure from the computer data, where each 

return control of the modeling and publishing portions of the dimension has at least one element. A number of cells are 

cycle to the same end user who would perform the analysis, created, each of which is associated with the intersection of 

while the IT department could be concerned with maintain- 35 at least two elements, each cell being capable of storing one 

ing the database infrastructure. A user could therefore have or more values. This embodiment provides at least one rule 

an intimate knowledge of how a result is obtained, because domain associated with one or more cells, each rule domain 

he would have built the model himself and could determine including one or more rules for assigning values to the 

the assumptions made in the model construction. Better yet associated cells. A Domain Modeling Rule Set is prepared 

would be a system in which the formulas and rules utilized 40 which prioritizes the rules and determines which of the rules 

in calculating a value could be accessed and modified as will provide the values associated with each of the cells in 

needed to make the model more accurate, or a system in the hyperstructure. The application of the domain modeling 

which hypothetical values or assumptions could be intra- rule set causes a physical transformation of the data corre- 

duced to observe how the outcome would be affected. This sponding to the physical objects which are modeled in the 

kind of OLAP modeling in which control is distributed to the 45 hyperstructure. A "cell explorer" is also provided, which 

end-users, rather than being centrally controlled by an IT allows direct viewing of the applied formulas that produce 

department, shall be called a "Distributed OLAP" or a specific value for a cell. The preferred embodiment of the 

DOLAP for short. present invention also utilizes a unique query language, 

Consequently, there is great need for a computer on-line named HQL, for Hyperstructure Query Language, which has 

analytical processing and modeling system that can utilize 50 been developed for use with multidimensional models, 

large databases in a relational manner, and a system that at A second preferred embodiment is a method of querying 

the same time offers large-scale modeling, requires less a multidimensional computer modeling data structure or 

administrative attention to maintain, supports many intcrac- hyperstructure in a computer system having at least one data 

tive users and allows easy insertion of hypothetical values to storage device, a data processing device, an input device, an 

produce "what-if' analysis. 55 output device, a hyperstructure constructed in one data 

DISCLOSURE OF THE INVENTION storage device, a calculation engine, a Domain Modeling 

. . . , . . , Rule Set Preparation Module, a query engine, and an evalu- 

Accordingly, it is an object of the present invention to ator which communicates with an SQL generator, a math 

provide a computer modeling system and method which is library, a cache retrieval device and a sort and search 

not limited to use in small data sets. 60 pTOOtsSQL melhod incIudes lbe sleps of i nputting a 

Another object of the present invention is to provide a query, parsing the query; creating a query component tree, 

computer modeling system and method that does not require inputting model metadata into the Domain Modeling Rule 

pre-calculation of all data cells, and which allows data to Set Preparation Module, generating a Domain Modeling 

remain in database warehouses until needed. Rule Set from the Domain Modeling Rule Set Preparation 

Yet another object of the invention is to provide a com- 65 Module, inputting the Domain Modeling Rule Set into the 

puter modeling system and method which can integrate well calculation engine, and generating an execution tree with 

with relational databases. rules from the calculation engine. This is followed by 
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inputting the execution tree with rules to the query engine, FIG. 1 shows the basic components of the DOLAP system 

generating an optimized execution tree from the query in block diagram form; 

engine, inputting the optimized execution tree to the nG. 2 is an isometric view of a generalized datacube; 

evaluator, and communicating between the evaluator and , . . - . . . . 

any of a number of data reference devices. An execution tree 5 FIG ' 3 15 an * ometnc view of an exam P le dalacubc; 

with results is generated, the query results are packaged and FIG - 4 is an isometric view of an example datacube 

the results displayed in an output device. including several sub-cubes; 

A third preferred embodiment is a computer system for FIG. 5 shows a graph depicting an example of how an 

modeling a number of variables which includes at least one HQL query accesses data from a database; 

data storage device, a data processor, input and output 1Q FIG. 6 shows the relationship between DOLAP entities in 

devices, and a method for creating data models or hyper- block diagram form* 

structures having one or more independent dimensions p, G ? shows the ' imernal sub-systems of the calculation 

Each dimension has one or more elements and a number of . ... . . , * * z A " ^mauuu 

cells, each of which is associated with the intersection of at ""ft* llS re «* with other DOLAP sub-systems 

least two elements, where each cell is capable of storing one m D10Ck dia S ram torm i 

or more values. This embodiment provides one or more rule 15 FIG. 8 shows a flow diagram of the major steps involved 

domains associated with one or more cells, each rule domain m answering an HQL query; 

including one or more rules for assigning values to the FIG. 9 shows a flow diagram of the major steps in Domain 

associated cells. A Domain Modeling Rule Set is prepared Modeling Rule Set Preparation; 

which prioritizes the rules and determines which of the rules FIG. 10 shows the detailed steps involved in the process 

will provide the values associated with each of the cells. 20 of gathering the rules; 

An advantage of the present invention is that it can be FIG. 11 shows the detailed steps involved in the subrou- 

used with very large databases. tine for creating and adding the Time Summary Properties; 

Another advantage of the present invention is that it FIG. 12 shows the detailed steps involved in the subrou- 
operates in a relational manner with databases so that ^ tine for adding rules to the Rules List; 
pre-calculation of all data cells is not required. This greatly FIG. 13 shows the detailed steps involved in the process- 
reduces the set-up time for making the model and also mg 0 f autoformulas; 

reduces effort and storage requirements. FIG. 14 shows the detailed steps involved in the process- 
Still another advantage of the invention is that it is much 7 m g 0 f metric datamaps; 

less .com plicated to use t han the prior art, so that non- [ Q mG 15 shows the d( ! tailed steps mwo \ vt6 in lhe process 

programmers may use it to build and analyze models. 0 f ordering the rules- 
Changes to the model cad be easjjy ^e without having t o FIG. 16 shows the detailed steps involved in the process 

spend.a__gr. eat deal of tune rebuilding . of promoling the ru]es; ^ 

Yet another advantage of the invention is that the user who nG 1? shows ^ delailed involved {q ^ ^ 

conducts the analysis can also construct the model. This 35 • of mle con fli cts 

allows more direct involvement by the user in the specifics 

of how the model functions. It also allows the Information BEST MODE FOR CARRYING OUT THE 

Technology department of a company to concentrate on INVENTION 

maintaining the database, rather than administering the A preferred embodiment of the present invention is a 

model construction, as generally occurred in the prior art. 40 Distributed On-Line Analytical Processor (DOLAP) and a 

A further advantage of the invention is that it allows the method for using it. As illustrated in the various drawings 

user to easily review the formulas and rules that are the basis herein, and particularly in the view of FIG. 1, a form of this 

for calculation of model values. Thus the user can have a preferred embodiment of the inventive device is depicted by 

more intimate knowledge of the assumptions which have the general reference character 10. 

been incorporated into the computer model. 4S As is usual in systems employing computer software, 

A still further advantage of the invention is that it allows there is a great deal of terminology involved (including 

le ss interruption of the development cycle, since the user . acronyms) which is specific to the field. Some familiarity 

who-anah aes the data can easily make modifications to that I with basic terms is helpful to later discussions of the 
same mod el to-pradu£e-the-destf ed-resu Its. — I invention, thus the following terms relating to relational data 

A yet further advantage of the invention is that it allows 50 base model building in general, and to the present DOLAP 

easy modeling of hypothetical data for easy "what if* invention in particular, are introduced in the partial glossary 

analysis. below: 

An additional advantage of the invention is that although GLOSSARY 

a large number of users can be supported and data can be k „ 

easily shared between models, access can be restricted and 55 AUnbute--An attribute identifies additional information 

regulated in a secure manner available about elements at a particular hierarchy level 

Ihese and other objects and advantages of the present W * cb * an b * *? d 10 delermine which elemeDls ™gnt be of 

invention will become clear to those skilled in the art in view ? f aa ^ sis ' e 

of the description of the best presently known mode of ^h?*^ ^ lir ^ sel ° f data stored «" thc P0M « 

carrying out the invention and the industrial applicability of 60 D0LAP has n ° knowledge of the contents, but provides an 

the preferred embodiment as described herein and illustrated t0 . acan ' " Ie L ct ' read ' and write the BL0Bs ' D0LAP also 

in the several figures of the drawings. storcs lts worksheets as BLOBs. 

Cell address — A specification of one element from each 

BRIEF DESCRIPTION OF THE DRAWINGS of the dime nsions necessary to locate a cell in the datacube. 

The purposes and advantages of the present invention will 65 An address is expressed as (element | element | . . . | 

be apparent from the following detailed description in element). If an element is ambiguous between defined 

conjunction with thc appended drawings in which: dimensions, it is disambiguated as dimension .clement. 
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Column description — Describes properties of a column in 
a tabic used in a DOLAP model. Includes the column name 
and its datatype. 

Database description — Describes attributes of a database 
used in a DOLAP model, such as connection limitations and 
parameters. 

Dimension — A named set of elements, which express 
business information arranged in a network of sets (parents) 
and their constituents (children). The network is largely 
hierarchical, although some elements may appear in more 
than one group (have multiple parents). Dimensions are 
orthogonal to each other, such as time vs. geography. 

Dynamic element specification — A description of how 
dynamic elements are created from the database on demand. 
This process also creates qualifier datamaps for these ele- 
ments. 

Element — A distinctly named member of a dimension. No 
two elements in a dimension can have the same name. No 
element can have the same name as any dimension. 
Parent elements, in addition to providing a distinct slot in 
a dimension, group together other elements, referenced 
in their formula rules, called children. Children may 
have multiple parents in the same dimension, meaning 
the children belong to more than one group. 
Formula rule — A rule describing how certain cell values 
are calculated. A formula rule may consist of a rule domain, 
which identifies the cells to which its result applies, and a 
means by which it gets values from other cell values or 
constants. See the glossary entries for each of the types of 
formula rules for details on how this works. 
There are three types of formula rules: standard formula 
rules, custom formula rules, and collection formula 
rules. 

Group — A named set of users who share the same access 
rights to a specific set of models. Users can have additional 
access rights. 

Hyperstructure Query Language (HQL) — is used for 
re questing_cells in the datacube, fetching lists of qualifyin g 
ekmea ts^Qaptionally with their selected attribute values), or 
fetchin g lists of unique attribute_values. 

Info space — A model builder-defined tuple specification 
where rules exist to determine a value for most of the cells 
in the tuple. (There may be some relatively small "sub- 
spaces" where rules do not exist. For example, the user may 
have a list of revenue sources, with no associated revenue 
values, and thus no rules, associated with a grouping parent 
in a financial statement.) The purpose of an info space is to 
support DOLAP client applications' ability to steer the user 
toward areas of the model's datacube which contain mean- 
ingful information. 

Column Join — Standard RDDMS term; a link between 
one column in each of two tables. 

Table Join — The set of all column joins between two 
specified tables, which define the one and only link between 
them. Additional attributes include the "directionality" of 
the join specified by indicating the one/many to one/many 
relationship between rows in the two specified tables. 

Join Path — Aspecific sequence of table joins that DOLAP 
is to use to traverse between two tables. Assigned automati- 
cally by DOLAP; may be overridden by the administrator. 

Metric database mapping rule — A rule associated with a 
metric-type element describing how and where information 
exists in and is retrieved from the database. 

A metric database mapping rule consists of the element 
the rule is attached to and the query information, which 
describes how to retrieve the information into the result 
cells. 
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Model — A named, file-like package consisting of: 

* Dimensions, elements, and rules 

* Blobs to store third party data and worksheets 

* A set of user entered values 

* Information spaces 

Persistent Object Manager — The permanent storage 
mechanism and associated software modules which store 
and retrieve all the information associated with models 
10 managed by a DOLAP server, including the models, 
DOLAP schema information, and user and group informa- 
* lion. 

Qualifier database mapping rule — A rule associated with 
a qualifier-type element that includes a WHERE clause 
15 describing how the qualifier can be used to qualify queries. 
A database mapping rule consists of 1) a rule domain 
(presented as a rule domain in the UI) which identifies 
the cells to which its result applies, and 2) WHERE 
clause information. 
20 Qualifier datamaps are created automatically by the 
dynamic element generation process. 
Rule domain — A description of the part of the datacube 
for which a formula can be used to calculate cell values. 
Schema — A set of persistent objects that specify how 
25 information is stored in databases, tables and their compo- 
nent columns. 

Schema object — Defines database information. Schema 
information is shared by all DOLAP Models. 
A given DOLAP system (i.e. DOLAP Server) has exactly 
30 one schema object which is edited with the Schema 
Browser. 

Table description — Information stored in the Schema 
object about tables used by DOLAP models. 

Tuple Specification — A description of some cells in the 
35 datacube, consisting of between one and N element-list 
terms, each corresponding to a different dimension, where N 
is the number of dimensions in the model. 

User — A named person who has the rights to log on to a 
DOLAP server and a security password. 
40 UEV set — The set of all user entered values in a model. 

User Entered Value (UEV) — A single number which is 
created (or modified) when a user manually types or pastes 
a value into a cell on a worksheet. The UEV consists of an 
address, which corresponds to the address of the cell on the 
45 worksheet, and a value. 

UEVs are commonly used to override numbers for what-if 
analysis and to enter and edit values for business drivers. 
"Constant" formulas arc handled specially by the DOLAP 
processor. 

50 DOLAP is a model-based client-server decision support 
system for multi-dimensional analysis of data stored pre- 
dominantly in relational Data Base Management Systems. 
FIG. 1 illustrates the basic elements of the inventive DOLAP 
system 10. At the broadest level, the DOLAP system 10 

55 includes the DOLAP server 12, the DOLAP client 14, and 
the customer supplied databases 16. 

The server 12 stores little or no end-user data, but rather 
retrieves that data from databases 16 on behalf of a client 
user/workstation 14, manipulates the numbers according to 

60 a specific business "model", and delivers the results of those 
calculations to the client 14 over a LAN or WAN. DOLAP 
caches some data for faster querying and calculating. The 
cache may be shared between users to improve performance. 
The DOLAP server 12 includes a calculation engine 18 

65 which handles the retrieval and caching of data and com- 
putation of cell values for the model. The DOLAP server 12 
provides results to all queries, calls out to the databases 16 
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to get values and performs calculations in memory in order 
to satisfy the queries, and caches values in memory so that 
subsequent queries requiring these values can use the pre- 
calculated values already in memory. The calculation engine 
18 is divided into a number of cooperating subsystems and 
data structures, augmented by several auxiliary subsystems 
which will be discussed below. 

The server 12 further includes a persistent object manager 
20 which is the permanent storage mechanism and associ- 
ated software modules which store and retrieve all the 
information associated with models managed by a DOLAP 
server, including the models, DOLAP schema information, 
and user and group information. 

The server 12 also includes a server administration sec- 
tion 22, a client request manager 24 and a database interface 
26. Additionally, local storage 28 and interfaces for Open 
DataBase Connectivity (ODBC) 30 and Remote Procedure 
Call (RPC) 32 support can be obtained off the shelf for use 
with the DOLAP system 10. 

The DOLAP client 14 is any program used by a user on 
a workstation that communicates with the DOLAP server 
12. The client 14 runs on a medium -sized Windows 95™ or 
NT 4.0 platform (minimum 12 MB RAM, VGA), and 
provides API-level access to data supplied by the DOLAP 
server. In addition, the DOLAP client provides user interface 
tools that support (1) the administrative and model-building 
functionality needed by all DOLAP applications (including 
those developed by third parties) and (2) modest end-user 
analysis and deployment functionality. 

The DOLAP client 14 includes a client library 34, and its 
own RPC support 36. The client library has no User Inter- 
face (UI) but provides a C and Visual Basic™ compatible 
Application Program Interface (APT) 38 which supports a 
continuously on-line connection to the server 12 using 
Transmission Control Protocol/Internet Protocol (TCP/IP). 
The client library 34 is designed so that multiple client 
applications which access the same server 12 can be running 
at the same time. The client API 38 can interface with an 
administration program or other client application 40 or with 
a web server (not shown) which may be connected through 
the Internet or an Intranet to a web browser. The client 
library 34 may also have a separate Object Linking and 
Embedding (OLE) API conversion library 42 that uses the 
client library 34 and may provide an OLE-style API to third 
party tools 44 such as Excel™ or applications written in 
Visual BasicTM or PowerBuilder™. 

The DOLAP clients 14 are connected to the server 12 by 
means of communications links 46. Customer supplied 
databases 16 such as Teradata, Sybase 10, Oracle, Informix 
and others communicate with the server 12 by means of 
communications links 48. 

A brief discussion of multidimensional model or hyper- 
structure construction may be helpful in understanding fea- 
tures of the present invention. FIG. 2 shows a generic data 
structure or computer model 50. A DOLAP model consists 
of a number of components built, maintained, and used by 
the DOLAP Server. A model's purpose is to present to a 
client application an accurate view of the structure of a 
desired part of a business, and to support the calculation and 
analysis of the data present within that structure. In a 
simplified form, a model consists of the model structure, 
storage for data (BLOBs — including worksheets and data 
from third party applications), an optional set of user-entered 
values specified by a client user (what-if values and business 
drivers, such as a commission rate) and zero or more 
information spaces (defined areas of the model that contain 
information). The resulting model can be visualized as a 
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datacube 52 composed of n number of dimensions 54, each 
of which includes one or more elements 58 whose intersec- 
tions define a plurality of cells 56. Although three dimen- 
sions 54 are illustrated here, the number of dimensions 54 is 
5 not limited. For this reason, the datacubes may also be 
referred to more generally as "hyperstructures 51". A special 
case of the general category of hyperstructures is defined by 
the term "hypercube", a term which refers to a cube having 
more than three dimensions. However, it should be under- 

10 stood that the terms "datacube" and "hypercube" as used 
here do not imply that the model is limited to structures 
having equal length sides. 

A rule domain tells DOLAP where a rule can be applied. 
A generalized rule domain is illustrated by the reference 

15 character 60. There are many possible rule domains in a 
particular model, and it is quite possible that a single cell can 
be located within two or more rule domains simultaneously, 
as seen in FIG. 2. Here, two rule domains 56, both of which 
are emphasized by hatched shading, intersect to include a 

20 cell 62, which is emphasized by cross-hatching. In estab- 
lishing the value which will ultimately be calculated for the 
cell 62, it is then necessary to prioritize the rules that apply 
and resolve conflicts between rules. These processes will be 
discussed in detail below. 

25 Dimensions 54 represent the fundamental characteristics 
of the business area being modeled, and for the purposes of 
a model are assumed to be orthogonal to each other. Typical 
dimensions include Finance, Time, Scenario, Organization, 
Geography, Products, Customers, Competitors, and so forth. 

30 Elements 58 are the specific items associated with each 
dimension. Using the examples above, the Time dimension 
might have the elements 1994, 1995, 1996 and the set of four 
quarters for each of those years. The Organization dimen- 
sion might be composed of the specific names of a holding 

35 company's subsidiaries, and the Geography dimension 
might consist of World-wide, US, Europe, and Asia, with the 
US being subdivided into North, West, Southeast, and 
Northeast regions, and so forth. Each cell 56 in the hyper- 
structure 51 can be addressed by specifying an element 58 

40 from each dimension 54 in the model 50. In general, the 
elements 58 within a dimension 54 are structured into a 
hierarchy of parent and child elements. The subdividing of 
the Geography region described above is an example of the 
hierarchical relationship among elements. DOLAP hierar- 

45 chies are not strict hierarchies, because elements can have 
multiple parent elements. 

There arc two types of elements 58 which can be added 
to the model 50. Metric elements represent quantitative 
values like money, inventory, units of sales, and so on. 

50 Qualifier elements represent lists of things such as 
customers, regions, cost centers, and so on. The combination 
of these two types of elements yields interesting analysis 
possibilities such as Revenue by Customer, or Expense by 
Cost Center. Most elements have rules associated with them 

55 that tell DOLAP how to get values for the cells. These rules 
can be either data maps or formulas, as described below. 
Some elements can be generated automatically based on 
information in the database. These elements are called 
Dynamic Elements. In typical DOLAP installations, many 

60 of the elements to be included in the model, or used as part 
of the model analysis, are represented by entries in database 
tables. In some cases, the hierarchy associated with the 
elements can be deduced and extracted from the database. 
Instead of the model builder manually adding these elements 

65 in their hierarchy and their data maps to the model, DOLAP 
can create the elements dynamically by using the Dynamic 
Elements option in the Modeler's Workbench. Because 
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DOLAP extracts the elements from the tables, it can also 
automatically construct the data maps for the elements. For 
example, the user may want lo add elements for each of his 
customers. If he has a large customer database, he would not 
want to type in all of the customers manually. 5 

Levels of dynamic elements are created by specifying 
data mapping information for each level. For each hierar- 
chical level, the data mapping information defines where in 
the database to get elements for that level of the hierarchy. 
When the user dynamically generates a hierarchy as part of 
his model, DOLAP saves the elements in the intermediate 
levels of the hierarchy with the model. (They are persistent). 
For the lowest level of the hierarchy (the leaf-level), the user 
can specify whether or not to save the elements in the model. 
(They can be persistent or transient). For example, if the user 
has a million customers, he might want to analyze the 15 
customers, but he would not want to store them in his model. 
When DOLAP needs the customer information during the 
analysis, the information can be retrieved at that time. 

Often, the database tables that are used to create the 
dynamic elements contain other information that is relevant 20 
to, or associated with, the element. For example, the data- 
base table used to create a customer hierarchy could also 
contain the street address, city, state, zip code, and phone 
number for each customer. These are called attributes of the 
element. The user can specify attributes for each level of a 25 
dynamic hierarchy so be can easily access or analyze infor- 
mation related to the dynamic elements. 

Although the DOLAP API will support a variety of 
presentations of the models structure and data, the DOLAP 
client application primarily utilizes treeviews to examine 30 
and manipulate the model's structure, and a worksheet to 
view data. The DOLAP Worksheet presents a multidimen- 
sional spreadsheet view of any portion of the datacube, and 
allows the user to enter what-if values and business drivers. 
It supports a variety of standard spreadsheet operations, plus 35 
others appropriate for multi-dimensional viewing such as 
pivoting across dimensions and drill-down, -up, and -across. 
Also, users of the DOLAP Manager application can send 
their data to a graph window driven by the First Impres- 
sions™ custom control for graphs. 40 

FIG. 3 illustrates a typical example utilizing the three 
dimensions of Time 63, Organization 64, and Finance 65, 
with the following structure: Time dimension 63 contains 
elements 1994, Ql-1994, Q2-1994, Q3-1994 and Q4-1994; 
Organization dimension 64 contains elements Amalgamated 45 
(a holding company), Acme (subsidiary), Bingo 
(subsidiary), and Coronet (subsidiary); and Finance dimen- 
sion 65 contains elements Profit, Revenue and Expenses. 

r lhe DOLAP architecture utilizes a rule-based methodol- 
ogy which provides strong business modeling capability 50 
when dealing with varying levels of dimensionality. This 
allows for a transparent use of a multi-cube approach to 
business modeling. Model builders can use any number of 
dimensions from the data cube. (Other OLAP products only 
support a single pre-defined hypercube or multiple 55 
hypercubes, which limit model -building flexibility, forcing 
model builders to build models based on the number of 
dimensions in the data cube.) DOLAP also supports the 
ability to limit levels of dimensionality by using a Not 
Specified (NS) condition in the rule domain for an element's 60 
formula or data map. 

FIG. 4 illustrates a hyperstructure 51 from which a 
sub-cube 66 has been extracted. An element in the Time 
dimension 63 has been declared NS, as also has an element 
in the Organization dimension 64, resulting in a second 65 
sub-cube 68, and a third clement in the Finance dimension 
65 has also been Not Specified resulting in sub-cube 70. 
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To obtain the values for cells in the model, DOLAP uses 
two types of rules: 1) formula rules which tell DOLAP to 
compute the cell value mathematically from other cells in 
the model, for example, Revenue is calculated by multiply- 
ing Price times Units Sold; and 2) data map rules which tell 
DOLAP to retrieve the cell value from the database. 
Additionally, DOLAP uses a user-entered value (UEV) if it 
exists. 

Formula rules, which are attached to elements, also have 
associated with them rule domains that define where in the 
datacube the rules can be used. Formula rule types include 
standard formulas, custom formulas and collections. Stan- 
dard formulas take a list of elements and apply an aggre- 
gation function. These functions include Sum, Average, 
Min, Max, and Count. Custom formulas let the user create 
his own formula comprised of operators and functions 
applied to elements. Standard formulas can also be used 
within a custom formula. Collections, which are not actually 
computation formulas, let the user specify a collection of 
elements to be considered children of the element that 
specifies the formula. These collections are used for 
reporting, rather than for computing. For example, the user 
may want Revenue and COGS listed as children but not 
necessarily used in a single formula that calculates the value 
of Income statement. The user can create a collection 
(children of Income statement) so the user can drill-down 
and view them in a worksheet. 

In general, formula rules utilize elements from the same 
dimension as the element whose value they are computing, 
but there are cases where a formula will compute a value 
based on elements in other dimensions. For example, in a 
planning model, the model builder may want to compute the 
revenue for a particular geographic region as a percentage of 
the revenue of another geographic region. 

Data map rules tell DOLAP where in the databases the 
raw data used in the calculations is found. These rules 
indicate a database and table where element-associated data 
is found, and may have a SQL WHERE clause restricting the 
data that is returned to just those rows in the table associated 
with the element. Data map rules are either metric or 
qualifier mappings. Metric data map rules indicate where 
data for the element is contained. Qualifier data map rules, 
for qualifier elements such as time, include information that 
DOLAP eventually turns into a SQL WHERE clause that 
identifies the database table rows containing data that 
applies to the element. The calculation engine uses this 
information, in combination with information from the other 
dimensions, to help determine which table must be accessed 
to obtain data for the model. The datamapping for a qualifier 
clement will have been created by the Dynamic Element 
Generation process, which created the hierarchy in which 
the qualifier element exists. 

Conceptually, when DOLAP needs a value for a cell 
whose value comes from the database, it scans the elements 
that form the cell's address for their data map rules. (To 
improve performance, the actual algorithm deals with 
groups of cells.) DOLAP determines which of the possible 
database tables is the correct one and formulates a SQL 
query to retrieve the data. 

FIG. 5 shows an example of a data retrieval. The Hyper- 
structure Query Language (HQL) is used for fetching cells 
in the datacube, fetching lists of qualifying elements 
(optionally with each element's selected attribute values), or 
fetching lists of unique attribute values. Although the subject 
will be covered in more depth below, basically the contents 
of cells are specified by the rules attached to the elements 
whose intersection locates the cell. Switching to an element 
in an orthogonal dimension is depicted by separating vertical 
lines. 
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In this case, the HQL query is to locate the revenue from 
Bingo Corporation for the second quarter of 1995. In SQL, 
this becomes SELECT Rev FROM Sub_Qtr_Jin WHERE 
Orgo"B" AND Qtr-"2-95'\ In this example it is assumed 
that Acme and Bingo arc subsidiaries of Amalgamated in the 
dimension of Organizations, and we want Revenue from the 
Finances dimension for Q2-95 from the time dimension. 
This information is stored in a database called Amal_ 
Warehouse from which the appropriate data value for the 
required cell may be accessed. Note that in the diagram the 
data map for "Revenue" is the metric mapping rule that 
retrieves the actual cell value. The data maps for "Bingo" 
and "Q2-95" are the qualifier mapping rules that specify the 
rows from which to get the values. 

Fundamental to DOLAP' s ability to handle large, com- 
plex models is the concept of rule domains. A rule domain 
tells DOLAP where a rule can be applied, and is evaluated 
in determining which of the possible rules to apply. 
Technically, the rule domain for a formula rule is the entire 
left-hand side of the formula. Most of the time, a formula 
applies to a single element as in the following example: 

Gross Margin«Revenue — Cost of Goods Sold 

Here, Gross Margin, the element to which the formula is 
attached, is calculated using the same formula throughout 
the model (for example, for all divisions and all products). 
It is also the formula's (degenerate) rule domain. This occurs 
so often, in fact, that throughout the specification, the term 
"rule domain" will generally refer to element specifications 
in dimensions additional to that of the element to which the 
formula is attached. For example, 

1996|Rcvcnue=1995 | Revenue ■ 1.25 

In this case, since the formula is being attached to the 
Revenue element, the explicit rule domain is "1996". The 
complete, implicit rule domain is 1996 | Revenue. 

There are cases where a single formula can not be applied 
so broadly. For example, revenue might be computed quite 
differently for past, current, and future years. If past years' 
revenues are the actual revenue values extracted from finan- 
cial systems databases, the current year's revenues are a 
forecast pulled from a sales forecasting system database, and 
future years' values are calculated based on past and current 
trends, the user would need to create different formula rules 
for each type of Revenue: 



Rule domain 


Rule 




Rule type 


1994 
1995 
1996 


Revenue - 
Revenue - 
Revenue - 


wa rehouse.actuals .rev 
warchouse.forccast.rcv 
1995 Revenue* 1.25 


data map 
data map 
formula rule 



^ When DOLAP cal culates a revenue cell valu e T it selects 
the rule to use based on the year value associated w ith each 
particular cell. When computing values for cells, DOLAP 
generally has a variety of methods by which a cell is 
calculated. Where formulas are composed of commutative 
operators, such as addition or subtraction without constants, 
the order of operations is unimportant. However, for formu- 
las including operations such as multiplication and division, 
the order of operations Ls important. The DOLAP design 
incorporates rule prioritization to address this problem while 
exploiting rule generality as much as possible. To determine 
how to compute the value for a cell, DOLAP scans all the 
rules that could apply to the cell and selects one according 
to its rule precedence algorithms. In general, DOLAP gives 
preference to rules with more specific rule domains, and to 
rules specified by the model builder, as opposed to rules 
inferred by database schema. 
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A variety of user interface tools are provided. Model 
Builders use the Library Explorer portion of the client 
application (to create new models or open existing models) 
and the Modeler's Workbench (to create the structure of a 

5 datacube). The Library Explorer displays the contents of the 
server's repository using a split window to manage its 
objects. A Schema Explorer allows users with administrative 
privileges to specify what databases are to be accessible to 
the models in a particular library. A SQL Audit facility 

10 allows a user to audit HQL queries that are sent from the 
client to the server and view the series of SQL queries that 
were generated by the Calculation Engine in the fulfillment 
of the HQL query. A UserEnteredValue List Window pro- 
vides a means by which a model user can see which user 

15 entered values have been entered into the datacube and 
delete any which are no longer desirable. A Worksheet is a 
multidimensional view of a subset of the datacube. It allows 
custom views of the model to be created and UEVs to be 
entered. 

20/j^FIG. 6 illustrates the relationships between entities 
involved in a DOLAP modeling system 10. The model 50 is 
a named file -like package, whi ch _rec ejve inputs from many 
sources. The user 72 is a named person who has access rights 
toThe model 50, either individually or as part of a group 74. 

25 The persistent object manager 20 holds the DOLAP models 
50 as well as meta-data used by the models managed by the 
server 12. It also holds BLOBS 76 that are arbitrary data 
streams stored on behalf of a third party DOLAP client 
application which are associated with a particular model. A 

30 group of related functions have been shown as a model 
structure and UEV module 78. A group of model data maps 
80 contains database-mapping rules that are attached to the 
model 50. This includes a group of rule domains 82. An 
element 58, which is included in a dimension 54, can receive 

35 input from formula rules 84, which are also included in the 
model 50. The formula rules 84 include the rule domain 
section 86, which provides a description of the part of the 
datacube 52 (see FIG. 2) for which a formula can be used to 
calculate cell values. The rule domains 86 are also included 

40 in the model tables 88. Certain elements, d ynamic parents . 
havejdvna micTelement specifictions which are used to gen- 
er ate hierarchies of elements under t he dynagrfer parerii s. The 
PES s pecifies data base tables whose columnrcqntain key 
values and.namejyfor.tbe^ele^ebtsrin The hieraxchy. Element 

45 5onSr7~also obtain input from either the metric database 
mapping rules 92 or the qualifier database mapping rules 94, 
which tell DOLAP where in the databases the raw data used 
in the calculations is found. The metric data base rules 92 
contain rules associated with a metric-type element which 

50 represents the "what" part of the name or address of a cell 
56 in a d atacube 52 (see FIG. 2). Generally the metric rules 
92 associate a measure such as Revenue with a numeric 
column. The qualifier data map rules 94, for qualifier ele- 
ments such as time, include information that DOLAP even- 

55 tually turns into a SQL WHERE clause that identifies the 
database table rows containing data that applies to the 
element. The calculation engine 18 (see FIG. 1) uses this 
information, in combination with information from the other 
dimensions, to help determine which table to access to get 

60 data for the model 50. The datamapping for a qualifier 
element will have been created by the dynamic element 
generation process that created the hierarchy in which the 
qualifier element exists. 
A User Entered Value (UEV) 96 is a single number which 

65 is created (or modified) when a user manually types or pastes 
a value into a cell 56 in a model 50 (sec FIG. 2). This UEV 
consists of an address (which corresponds to the address of 
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the cell) and a value. These UEVs 96 supersede the values data in the databases through the ODBC 30. The results are 

calculated from rules at the same level of specificity (unless sent to the result extractor 134 which communicates with the 

there are other more specific rules) and are used by DOLAP query result cache 124 and the rule evaluator 128. The 

to calculate jfa e^values Jbr other cells whose values depend results may then be sent to the client request manager 24, 

onjhem. UEVs 96 are commonly used to override numbers 5 and subsequently to the client application, 

for what-if analysis and to enter and edit values for business The steps involved in this process are shown in FIG. 8 as 

drivers. The UEVs 96 are stored in a UEV set 98 which is a generalized flowchart of DOLAP operations in response to 

included in the model 50. a user query. 

A cell address 100 specifies one element from each of the The steps of the flowcharts to follow may be implemented 

dimensions necessary to locate a cell 56 in a datacube 52 10 by one or more software routines, processes, subroutines, 

(see FIG. 2) and can be used' to specify UEVs 96. modules, etc. It will be apparent that each flowchart is 

Only a tiny fraction of the cells 56 in a model 50 actually merely illustrative of the broad logical flow of the method of 

represent meaningful information. Users and client applica- the present invention and that steps may be added to, or 

lions attempting to navigate the datacube can frequently taken away from, the flowcharts without departing from the 

benefit from guidance as to where the useful information 15 scope of the invention. Further, the order of execution of 

resides in the datacube 52 (see FIG. 2). The DOLAP Client steps in the flowcharts may change according to different 

Application lets the model builder create and maintain an implementations such as interrupt-d riven, polled, etc., 

infospace 102 that defines a set of dimension and default event-handling. A multiprocessing or multitasking environ- 

element combinations which in turn define useful portions of ment could allow steps to be executed concurrently. For ease 

the DOLAP datacube. 20 of discussion, the implementation of each flowchart is 

The schema 104 is a set of persistent objects stored in a referred to as if the flowchart were implemented in a single 

repository that specifies how information is stored in "routine". 

databases, tables and their component columns. Schema 104 The method 200 starts as a Hyperstructure Query Lan- 

contains the physical table configuration and relationship guage (HQL) query 202 which is passed to a parser 122 

between tables of data. The schema object 106 contains the 25 which converts the HQL text to a query component tree 204 

descriptions of the columns 108, tables 110, databases 112, which represents the component parts of the query 202. This 

and servers 114 corresponding to the contents of the data is sent to the calculation engine 18. The model metadata 206 

warehouses 16. is the information about the structure of the model 50 (see 

Columns in specified tables can be linked or joined FIG. 2) which has been previously constructed. These are 

together. The join path 116 is the preferred join out of all 30 provided to Domain Modeling Rule Set Preparation 208 

possible joins between any two tables in the databases of a which generates the domain modeling rule set 126, which 

DOLAP schema, and may be the default or user-set path. are then supplied to the calculation engine 18, which takes 

The purpose of DOLAP 10 is to retrieve data from one or the applicable rules and adds them to the query tree 204 to 

more databases 16 (see FIG. 1), perform calculations as produce an execution tree with rules 212 for the query 

specified by the model 50, and return values for specified 35 engine 132. This query engine 132 produces the optimized 

cells 56 (FIG. 2) to users via client applications 40 (FIG. 1). execution tree 214 by combining queries and delegating 

The core part of this functionality, the retrieval and caching calculations to the database server, whenever possible. The 

of data and computation of cell values for the model, is optimized execution tree 214 is passed to the evaluator 128 

handled by the calculation engine 18. The calculation engine which decides whether further data is required from a 

18 is divided into a number of cooperating subsystems and 40 relational database. If further data is required, evaluator 128 

data structures, augmented by several auxiliary subsystems. communicates with a Relational DataBase Management 

FIG. 7 illustrates the components of the calculation engine System (RDBMS) 216 through an SQL generator 218. 

18. It communicates with the client request manager 24, Evaluator 128 can also communicate with a math library 

from which the calculation engine 18 receives queries, and 220, if a calculation is required, or a sorting and processing 

to which it returns results in response to these queries. The 45 system 222, if the process requires ordering of results or 

engine 18 also communicates with the persistent object sorting in some manner. In order to save processing time, 

manager 20 which contains the model 50 and the schema values which have been previously calculated may be 

104. Additionally the calculation engine 18 communicates obtained from cache retrieval 224 which accesses a multi- 

with the database interface 26 that includes a query dis- dimensional cache 226. The execution tree with results 228 

patcher 120 and an Open Data Base Connectivity (ODBC) 50 then proceeds and, if it is done 232, it may be sent to result 

portion 30. An incoming client request is routed to an HQL packaging 234 for eventual output of the query results 236. 

parser 122 which transforms the HQL query from its textual The query 202 may include an inner query that must be 

form into an internal representation. After the internal rep- processed before the query as a whole can be processed. In 

resentation of the query is made, the calculation engine 18 this case after one pass through the process, when arriving 

looks in the query results cache 124 to see if the query (or 55 at the decision box, the result will be not done 230, and the 

any portion of it) can be retrieved from the cache in order to process will continue around the loop again until completed, 

save processing time. When following the previous process, it is important that 

A Domain Modeling Rule Set 126 r eceives input from the system be able to determine which formulas apply to a 

bo th the model 50 and the schema 104, and providesTn put certain cell. If there are multiple formulas that apply, there 

t ojhe rule evaluator 128 . which then communicates with the 60 must be a hierarchy of rules established so that the correct 

execution plan 130 and the query engine 132. The query formula is used and, consequently, the correct value is 

engine 132 creates SQL queries from HQL queries and assigned. This hierarchy of rules and the associated priorities 

optimizes these SQL queries by combining them (whenever involved are very important to Domain Modeling, 

possible) and also optimizes the performance of the calcu- ^ The Domain Modeling Rule Set 126 (see FIG. 7) is a - 

lation engine 18 by performing as many calculations as 65 run-time data structure used by the calculation engine 18 to 

possible with SQL statements executed by the database. The resolve requests. Domain Modeling Rule Set 126 is prepared 

SQL is sent to the query dispatcher 120, which accesses the from the model description and embodies DOLAP's knowl- 
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edge of all rules within the complete datacube 52 defined by Set Preparation process 300 is to gather the rules 400 from 

all the elements 58 of all the dimensions 54 in the model 50 the model and convert them to the uniform representation 

(see FIG. 2). The structure is created by calculating the ' that will be used by the calculation engine 18 (FIG. 8). Next, 

effective Rule Domain for each metric database map, UEV, the autoformulas are processed 500, followed by processing 

and formula rule in the model, and prioritizing them so that 5 the metric datamaps 600. The rules are then ordered 700, 

the proper rules will be executed. Domain Modeling Rule certain of the rules are promoted 800, and conflicts between 

Set Preparation is triggered by the first HQL re^ucstafter the rolcs are processed 900. Then Domain Modeling Rule Set 

model is opened or after the model is changed, unless the Preparation is then done 995. 

only changes were UEV value changes or if user requests to Tf 1 ? ™* T enters UEVs ex P uclU y> 50 their conversion is 

see the^ule for a cell. ~ 2 . , 0 straightforward. The elements that are specified in the 

^mModelinlRule Set Preparation is the process of addre f . are £c domain Ibr UEVs. For dimensions not 

gathering all of the Tules from the model, turning them into T ? " ^ ^ * ^T* 

im »• .u . u *u ii.- o elements in those dimensions, including the Not Specified 

valid actions that can be used by the calculation engine, and ( N / S ) e i emem 

prioritizing them so lhat I me proper rules take precedence. In Formulas are also entered by the user explicitly, and are 

the process, Domain Modeling Rule Set Preparation must 15 attached t0 the e i ements . Th e mXc domains are initially set 

check for conflicts in the rules, i.e. situations where it is t0 lhe mle domain specified by the user, with the rule domain 

ambiguous which rule should take precedence in a case in thedimension of the element containing just that element, 

where the results of applying the two rules would yield List formula rule domains require additional processing, 

different results. Domain Modeling Rule Set Preparation All the elements that have an attribute "do not aggregate" are 

supports an Application Program Interface that can specify 20 removed from the rule domains of the list formula rules, 

for a client which specific rule was/would be used to which would otherwise be used to compute the "do not 

calculate the value for any cell in the model. aggregate" element. The one exception is that if the element 

Rules drive all computations within the DOLAP Server. that this formula is attached to is marked with "do not 

They are stored in the model as UEVs, formulas and aggregate", DOLAP will not remove this element from its 

datamaps and are composed of two parts, 1) a rule domain 25 own formula's rule domain. 

and 2) an action. FIG. 10 shows the general step of gathering rules 400 in 



The rule domain specifies the cells whose values can be 
computed by the rule's action. The rule domain is 
rectangular, i.e., it is a sub-cube of thejnodel dat acube. The 



more detail. A list of rules having "Do Not Aggregate" 
properties is built 405. The next rule is fetched 410. If there 
are no more rules to fetch 415, the routine is done 490. If not, 



rule domain is restricted Jo. t he jul e domain as specified by^jo the rule is tested to see if it is a datamap 420, an autoformula 
the user, but additionally r educed a s described below. 430, or an aggregation 440. Datamaps are saved in a 
DOLAP supports three different kinds of actions: UEVs, Datamap List 425, autoformulas are added to an Autofor- 
datamaps and formulas. UEVs are the simplest rules within mula List 435. If the rule is an aggregation, a subroutine to 
DOLAP. The action of UEVs is simply to supply a constant Add/Create Time Summary Properties (TSP) Rules 450 is 
value, while the UEVs address is used as their rule domain. 35 called. The details of this subroutine are seen in FIG. U, to 
Formulas are attached to specific elements in the model, be discussed below. If the rule is neither a datamap, auto- 
with the element forming an implicit part of the formula's formula or aggregation it is added to the Rules list 460 
rule domain. The formula expression itself is the action. through another subroutine, details of which are seen in FIG. 
Datamap rules map a cell value to the value from a column 12 below. The routine loops back to step 410 to get the next 
in a database table. Essentially, a datamap rule is a SQL 40 rule, and this loop is repeated until the last rule has been 
SELECT statement, specifying which rows from a table processed 415, at which time the routine is done 490. 
have to be selected and which column from the selected The time summary property requires special support 
rows contains the value. during Rule Gathering. Two inputs create these time sum- 
Additionally, there are autoformulas which are default mary rules: 1) list formulas in the time dimension that have 
rules which describe very generic processes, for example, 45 their type set to Aggregate, and 2) elements that are not in 
"the vaJjic^c£^par^LequaJ^the^suji Lof its chil dren". These the time dimension that specify a time summary property 
"unpromoted" autoformulas are th e lowest^ level in priority other than "None". The first of these inputs contains a rule 
of application unless their priority has been adjusted higher domain and operands, but no operator; the second defines 
to "promote" them. The unpromoted autoformulas only the operator (i.e. sum, avg, min, max, first, last). The two in 
apply where higher priority rules have not been imposed and 50 combination form a complete rule. The Not Specified ele- 
can therefore be discarded from the prepared rules. A second ment (N/S) will have "None" for its time summary property, 
round of rule promotions will cause additional autoformulas For every such pair where the element with the time sum- 
to bej^tained, where the autoformula refers to a value that mary property is within the rule domain of the aggregation 
will be retrieved via a metric data map. Unpromoted auto- formula, Domain Modeling Rule Set Preparation will create 
formulas for dynamic elements are created from generic 55 the rule and will adjust its rule domain to contain all the 
autoformulas which are generated from dynamic element elements with the time summary property in that element's 
specifications. dimension. The other dimensions will remain unchanged. 

FIG. 9 is a flowchart which shows the general steps FIG. 11 shows details of the Add/Create TSP Rules 

involved in Domain Modeling Rule Set Preparation. The subroutine 450, which is called from various points in the 

smaller steps involved in each general step are shown in 60 whole program. The next TSP in the model is fetched 452. 

greater detail in FIGS. 10-17. The overall process of If there are no more TSPs 454, the subroutine is done 458. 

Domain Modeling Rule Set Preparation is referred to by the If not, a new rule is created for that TSP which is derived 

reference number 300. In referring back to FIG. 8, this from the original formula 456 according to the procedure 

process takes place in Domain Modeling Rule Set Prepara- described in the paragraph above. Next, the subroutine is 

uon 208, and the end product is the domain modeling rule 65 called to add it to the Rules List 460 (see FIG. 12). The 

set 126, which arc input to the calculation engine 18. program loops back to 452 to get the next TSP until all have 

Returning to FIG. 9, the first step in Domain Modeling Rule been processed and the loop is done 458. 
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FIG. 12 illustrates details of the Add to Rules list 
subroutine 460, which is called from different points in the 
program. The rule is tested to see if it is a list formula 462. 
If not, it is appended to the Rules List 474 and the subroutine 
is done 476. If it is a list formula, the next element in its rule 
domain is fetched 464. If there are no more elements 466, 
the rule is appended to the Rules List 474. If not, the element 
is checked to sec if it has a "Do Not Aggregate" property 
468. If it does not, the next element is fetched 464. If it does, 
the element is checked to see if it is in the dimension of the 
element to which this rule is attached 470. If yes, the next 
element is gotten 464, if no, the element is removed from the 
formula's domain 472 and the next element is gotten 464. 
The subroutine is looped until each element has been 
checked 466, and then the rule is appended to the Rules List 
474 and the routine is done 476. 

FIG. 13 shows details of the general step of processing the 
autoformulas 500. The next rule in the Autoformula List is 
fetched 510. There is a test to see if the end of the list has 
been reached 520. If yes, the routine is done 540. If no, the 
rule is tested to see if it is in the Time dimension 530. If not, 
the rule is sent to the Add to Rules List subroutine 460. If the 
rule is in the Time Dimension, it is sent to the Add/Create 
TSP Rules subroutine 450. The routine loops back to get the 



procedure scheme is that user entered model rules override 
the values that are stored in the database. 

Model rules include UEVs and user entered formulas. 
Model level rules are prioritized first by their specificity, 
with highest specificity first, and lowest specificity last. 
Specificity is determined by the number of dimensions 
specified for the domain of the UEV or formula. For UEVs, 
the number of dimensions specified in the UEV address is 
the specificity. For formulas, specificity is the number of 
dimensions specified in the rule domain. If this formula is 
attached to an element, the dimension of the element is 
included as a specified dimension. 

Within one group with similar specificity, the model rules 
are prioritized by their types. UEVs are first, followed by 
formulas. Formulas, including user entered formulas and 
autoformulas, are further broken down into high-priority and 
low-priority formulas. A lowpriority formula is defined as 
one that contains a single element with no operators, or a 
formula made up entirely of sum, +and - operators. All other 
formulas aformulas. Thus, the high-priority formulas. Thus, 
the order of precedence for model rules of equal specificity 
is UEVs, high-priority formulas, and low-priority formulas. 

Database rules include database mapping rules and pro- 
moted and unpromoted autofo rmulas created fo r parent 
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next rule 510, until all rules are processed 520 and the 25 e lements in dynamically generated hierarchi es, since" these 

KAfi _ '1 . i c i • .l. j_ 



routine is done 540. 

Datamap rules are constructed from metric datamaps, 
qualifier datamaps, and model table rule domains supplied 
by the user. The metric datamap specifies the table, and 
column to be used. In addition, there are qualifier datamaps 
that can be used to select specific rows from the table 
containing the metric column. 

FIG. 14 illustrates details of the general processing of 
Metric Datamaps 600. Hierarchies and levels are gathered 
from all dimensions 605. The next rule is fetched from the 
Datamap Rules List 610, and checked to see if it is the end 
of the list 615. If so, the routine is done 620. If not, all 
elements and dimensions are set as valid for the current 
datamap 625. The next hierarchy is fetched 630 and checked 
to see if it is the last 635. If it is, the rule domain specified 
by the user for the metric datamap 's table is accessed. Any 
elements not specified in the table rule domain are turned off 
for the datamap 's rule 640, and the rule is added to the Rules 
List by the subroutine 460. If the hierarchy is not the last, the 
next level is fetched from the bottom level up 645 and tested 
to see if it is the last level 650. If so, the program loops back 
to get the next hierarchy 630. If not, the level is tested to see 
if the datamap's metric table is joinablc to the previous level 
655. If not, the next level is fetched 645, and if so, then the 
level is tested to see if this is the first level encountered in 
this dimension which is joinable 660. If not, the level's 
elements are added to the rule's domain 670. If so, all 
elements from the dimension are turned off 665 before 
adding the level's elements to the rule domain 670. If the 
rule has non-sum TSP or is marked "Do not aggregate" and 
is the lowest level of this hierarchy, then it is considered a 
low- level datamap 675 and the program loops back to get the 
next hierarchy 630. If not, the program loops back to get the 
next level 645. The routine repeats until the last datarnap 
rule has been processed 615 and the routine is done 620. 

DOLAP allows a model to include more than one rule that 
computes the same cell in the data cube. If multiple rules can 
compute the same cell, DOLAP has to choose which rule to 
apply. The rule precedence scheme described below is based 
on the premise that database-mapping rules inherently have 
a I6wcr priority than rules attached or specified for specific 
elements in the model. Another way of looking at this rule 
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represent aggregations that can be performed in the data- 
base. Database mapping rules take precedence over the 
unpromoted autoformulas to ensure that as much aggregat- 
ing of values is done in the database as possible. The 
promoted autoformulas are given higher priority than the 
database mapping rules in cases where the user has specified 
a modelie vel ru le that overrides a value that would other- 
wise be taken from the database. Putting these items just 
above the database mapping rules will ensure that the 
program does not go to the database to get values, but rather, 
uses the autoformulas to get to the usecejjterjcolvalueuM-rule. 

The following outline summarizes how rules are sorted 
into priority groups. 
Model Rules 
Specificity — highest 
UEVs 

High-priority formulas 
Low-priority formulas 

. (each level of specificity repeats the pattern) 

Specificity — lowest 
UEVs 

High-priority formulas 
Low-priority formulas 
Database Rules 
Promoted Autoformulas 
Datamap Rules 
Unpromoted Autoformulas 

FIG. 15 shows the details involved in the general step of 
ordering the rules 700. It will be understood that the method 
illustrated for ordering the rules is subject to considerable 
variation (for example, in the values assigned for primary 
and secondary sorting) and other methods of ordering the 
rules may be employed. 

The next rule is fetched from the Rules List 710. If there 
are no more rules left 720, the Rules List is sorted 770, to be 
discussed below. If there are more rules, the rule is tested to 
see if it is an autoformula 725. If so, it is assigned a primary 
sort value of zero 730 and sent to be assigned a secondary 
sort value 750, to be discussed below. If the rule is not an 
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autoformula, it is tested to see if it is a metric datamap 735. 
If so, it is assigned a primary sort value of one 740 and sent 
to be assigned a secondary sort value 750, to be discussed 
below. If the rule is not a metric datamap, the rule is assigned 
a primary sort value 745 equal to 3 times its specificity plus 
zero for a low priority rule type, plus one for a high priority 
rule type, or plus two for a User Entered Value. The rule is 
then assigned a secondary sort value 750 equal to "y", where 
y may be chosen from the following assignment scheme: 



Rule type 


Characteristic 


Value(s) 


UEV 


(all formulas) 


1 


High Priority 


(all formulas) 


1 


Low Priority 


Non-sum operators 


Dimension# 




Sum operators 


Dimension? + ffdims in model 


Datamap 


(all formulas) 


Row count 


Auto formula 


(all formulas) 


Dimension? 



The routine loops back to get the next rule from the Rules 
List 710 until all rules are processed 720. The Rules List is 
then sorted with descending primary values and ascending 
secondary values 770. The routine is then done 780. 

Even with the above structure, it is possible for two or 
more rules to end up in the same priority catagory. The 
following section describes how DOLAP resolves the situ- 
ation for each of the lowest divisions. 

Hig^-priority formulas: Multiple high-priority formulas 
will yield a conflict. DOLAP will report this conflict. Users 
must resolve these conflicts with rule domains. 

Low-priority formulas: This may or may not yield a 
conflict. The concept of commutative formula pairs is used 
to resolve conflicts. One definition used in a version of the 
present invention limits commutative formula pairs to those 
that sum their children, because such commutative formula 
pairs are mathematically guaranteed to produce equivalent 
results. Analysis of formula pairs resolves questions of 
which formula should be used and whether a conflict exists. 
Not all combinations of pairs need to be analyzed. The 
process can start with any two formulas and compare the 
"winner" with the next arbitrarily ordered formula, tossing 
away the "loser". 

The results of a pair comparison will yield one of: 

1. A true conflict. This occurs when the formulas are not 
commutative. There is no need to continue the analysis. 

2. A "winner" is determined. This occurs when the for- 
mulas are commutative, and the formulas are associated 
with different dimensions. The winner is the formula whose 
dimension has higher priority, with higher priority being 
determined by which dimension appears first in the order of 
dimensions. 

3. A commutative conflict. This occurs when the formulas 
are commutative, but are associated with the same dimen- 
sion. If this occurs on the last pair of rules to be compared, 
a conflict will result. If there are more rules remaining to be 
compared, the scanning will continue in the hope of finding 
a new "winner**, that is a commutative rule with a dimension 
with higher priority. If none is found, the conflict condition 
will result. 

For prioritizing datamap rules, when multiple datamap 
rules are available, the DOLAP server will prioritize them 
according to the last known row count that each table is 
known to have. The smaller the number of rows in the table, 
the higher priority the rule has. If two tables have the same 
row count, RSP will prioritize the two arbitrarily, and that 
order will hold consistently for as long as that version of the 
prepared rules are used. 

For prioritizing promoted/unpromotcd autoformulas, if 
there is more than one, the program will choose first the time 



.8,232 

22 

dimension formula if there is one. This is to ensure that time 
summary properties that are not summed will be computed 
first, leaving a query that can be aggregated in the database. 
If there is no time dimension formula, the program will 

5 choose from the other dimensions by dimension priority, 
with the first dimension being the highest priority. No 
conflicts are possible, because there can be at most one 
formula associated with each dimension. 
The precedence determination scheme described above 

1(J requires that the program will promote autoformula rules to 
be model level rules when any of their inputs derive from 
model level rules. This ensures that when a model level rule, 
like a UEV, is entered at a given level, the autoformulas that 
can sum that cell will take precedence over rules that retrieve 
data from the database. Tlius, the datamap rules for inter- 

15 mediate levels will not be used, and the sum that will include 
the model rule (UEV in this case) will be used. 

When the autoformula is promoted, only the parts of the 
rule domains that are affected by the model rules will be 
included with the promoted autoformula. The remaining 

20 parts of rule domain will remain with the unpromoted 
autoformula. 

The procedure to promote the rules is as follows: 

1. Split all the rules into two groups: model rules (UEVs 
and formulae) and DB rules (datamap rules and autoformula 

25 rules). The rules from the first group have higher priority 
than rules in the second group. 

2. For each model rule that can provide a value to an 
operand of an autoformula rule, find a sub-domain of the 
autoformula rule that it computes and move this portion of 

3Q the autoformula rule into the promoted autoformula rules 
group and repeat the process. 

FIG. 16 illustrates the detailed steps involved in the 
general step of promoting rules 800. The next rule is fetched 
from the Rules List 805. If there are no more rules to 
process, the routine branches down to step 850, to be 

35 discussed below. If it is not the last rule, it is tested to see if 
it is a User Entered Value 815 or a non-autoformula 820. If 
it is either of these, the rule is added to the Trigger List 825. 
If it is neither, the routine loops back to get the next rule 805. 
When the last rule has been processed, the routine switches 

40 to the Trigger List and fetches the next rule from the Trigger 
List 850. If there are no more 855, the routine is done 890, 
If not, the next generic autoformula in the Autoformula List 
is fetched 860. There is a test to see if there are no more 
autoformulas 865. If the next generic autoformula was not 

45 the last, the elements which could be computed directly or 
indirectly from the rule from the Trigger list are computed 
from the autoformula's dimension 870, and the routine loops 
back to get the next autoformula 860. If the end of the 
autoformula list has been reached 865, the autoformula list 

50 is started over by fetching the next autoformula 875, which 
is then checked for the end of the list 880. If the end of the 
autoformula fist has not been reached, a promoted autofor- 
mula rule is created 885, where its rule domain equals the 
Trigger Rule's elements from the autoformula dimension 

55 plus the Trigger Rule's elements and the affected elements 
from all dimensions. The routine loops to get the next 
autoformula 875 until the end of the list is reached 880 and 
the routine then loops back to get the next Trigger rule 850 
until the end of this list is reached 855 and the routine is done 

60 890. 

It is possible and likely that in the process of building a 
model, the user will define rules and datamaps which result 
in conflicting rules. Conflicts may be handled in the follow- 
ing manner: 

65 1) Domain Modeling Rule Set Preparation will create 
"transient" formulas, with properties available to the client 
as follows: 
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the two rules that conflicted, the Cell Explorer from the Modeler's Workbench, the data 

the time summary property for each (if needed). cell 56 of interest is determined by using the default element 

2) When the server dumps prepared rules in response to of each dimension 54 in the model 50. If the user has not 
appropriate model changes, the transient formulas will also assigned a default element for a dimension 54, Cell Explorer 
be invalidated. 5 will use the first element 58 in the dimension 54. When 

3) Because of 2), the client application must cache all rule opened from a worksheet, the Cell Explorer uses the cur- 
conflict data locally, since the data could be invalid later on. rentl y selected worksheet cell. If the worksheet has no 

4) An API returns conflict pairs of formulas, eliminating elements 58 in it (e.g. has just been created) and is the active 
duplicate pairs. No rule domain info is returned; the client ft l ^° w A f^ 06 ? Ex P Iore u r «»™» 15 disabled. 

will presumably retrieve the original rule domains from the ioTO™ 6 CeIi Ex P lorcr *? e clo f d Fde J ^ 

model using the handles provided, sometimes indirectly, (cf mfenu «7" nd or f clo T s f box m lhe U ^ T 

5, below), where the handle to the metric table would have corner . of th * Wl f™. The " ser can a *f° minimize. 

• u maximize, and restore the window using the appropriate 

to be searched for in the models table datamaps, to find the window boxes While me ^ £ lorer ^ miaimj ^ & will 

rule domain attached to the table. 5) Datamap conflict rules contimJe t0 recompute lhe data cell 56 and vahies if 

return a handle to the associated metric table. is thc auto-iecaJcjDi2a^isJurned on. 

FIG. 17 shows the detailed steps involved in the general -Lu To obtaufthe valuTofthe data cell and source cells, the 

step of resolving conflicts 900. The next rule is fetched from Cell Explorer makes a query to the DOLAP server. If the 

the Rules List 905, and there is a test for the end of the list window's auto-recalc setting is on, a query will be made 

910. If so, the routine is done 990. If not, the rule is tested whenever th e focus cell is .chan ge d, the m o del 50 cha nges, 

to see if it is a User Entered Value (UEV) 915. If so, a check 20 o r_a OEVlft is chang ed. Otherwise, cliclang the Recalc 

set consisting of all UEVs with equivalent specificity is but ton on the toolbar will cause th e^query. While the query 

created 920 and the routine branches to step 950, to be is in progress, a modal dialog will be presented allowing the 

discussed below. If not a UEV, the rule is checked for high user to cancel the query before it finishes, if desired. If the 

priority 925. If so, a check set of high priority rules with query is cancelled, the Cell explorer's auto-recalc setting 

equivalent specificity is created 930, and then the routine 25 will be changed to OFF. If appropriate model changes have 

goes to step 950, to be discussed below. If not a high priority been made, or if this is the first query to the model, the 

rule, it is tested for low priority and non-commutative DOLAP server will prepare the model's rules, and a message 

properties 935. If it is a low priority rule with non- to that effect will appear in the progress dialog, 

commutative properties, a check set of low priority rules The Cell Explorer window is divided into an upper pane 

with equivalent specificity is created 940, and the routine 30 and a lower pane within a resizable window divided by a 

goes to step 950, below. If the rule is not both low priority user-adjustable splitter bar. The top pane displays informa- 

and non-commutative, a check set of commutative low tion for the data cell of interest. Stacked vertically in the 

priority rules with equivalent specificity from the same middle of the pane are tiles/buttons representing the address 

dimension is created 945. The routine then goes to step 950, of the focus cell. The top button contains an edit text control 

referred to above, which does a pairwise comparison of all 35 with the value of the focus cell (assuming its value is being 

the rules in the set. For each pair with intersecting rule computed - see below). One of these buttons will be 

domains, a conflict rule with the rule domain equal to the selected. On the left hand side is a stack of buttons repre- 

intersection with the set is inserted ahead of the rule in the senting the children of the selected center-column element. 

Rules List 955. The references to the conflicting rules are Clicking on one of the center buttons causes that element's 

then saved 960. The routine then loops back to fetch the next 40 children to be displayed on the left hand side. The right hand 

rule 905. The routine continues until the last rule has been column contains a similar stack of buttons representing the 

processed 910 and the routine is done 990. parents of the element selected in the center column, and 

In order for the user, who may not have an extensive clicking one of the center buttons updates the right hand 

background in computer programming, to see how the value column as well. The buttons in the left and right hand 

contained in a particular cell is calculated, DOLAP 10 is 45 columns each contain a static text which contains the value 

equipped with a graphic user interface called the Cell of the cell represented by substituting that element into the 

Explorer. Thc Cell Explorer enables thc user to explore cell address of thc center column. A vertical scroll bar 

values in a model's data cube 52 easily without thc distrac- appears for thc top pane if all thc clement buttons will not 

tion of non-essential values that are usually visible in a fit in the pane, so the user can scroll up/down to bring 

worksheet. It is also useful for correcting errors in the model 50 desired element buttons into view. 

50 in later stages of development. The Cell Explorer is The lower pane of the window contains three checkboxes, 

available from the Tools |Cell Explorer menu command indicating whether or not the cell values for the child cell(s), 

whenever the Modeler's Workbench or a Worksheet is the focus cell, and parent cell(s) should be queried for and 

active window. displayed, allowing a user who is just navigating to do so 

The Cell Explorer gives the user a way of exploring the 55 without waiting for the server to compute those values. This 

c alculation process of a particular cell 56 fro m the model's pane also displays the name and expression for the rule used 

datacube 52. This is similar to having a one-celTworksheet. to compute the data cell's value. (Note that this rule will not 

In addition" to seeing the value for the data cell 56 and the necessarily refer to the elements in the Sources list.) The 

rule that was used to calculate the cell 56, the Cell Explorer contents of this field will be the same as if this cell is selected 

also enables the user to view the source and parent elements 60 in a worksheet. Typing into the focus cell's value field adds 

for selected elements that make up the data cell's 56 address. a User Entered Value (UEV) 96 to the model 50 or replaces 

In addition, like worksheets, the user can display the SQL the one already defined for that cell 56. The values for the 

Audit window to view the SQL queries generated by the Cell focus cell and sources cells will be formatted according to 

Explorer, the formatting specified for the element from the Time 

y-> The user can open multiple Cell Explorers by repeatedly 65 Summary Property dimension of the model. The lower pane 

using the Tools | Cell Explorer menu item from thc Mod- also contains a field containing the model builder's descrip- 

elcr's Workbench or from a worksheet. When the user opens tion of thc element selected in the central column. 
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The left column, titled "Sources", contains the list of the 
children elements for the selected data cell address element. 
Values for the cell's address by child element are also 
displayed. 

The top item of the center column displays the data cell's 5 
computed value. If the cell returns an error, "Error" will be 
displayed in the value. Also, the display shows the data cell's 
address as a list of the elements (buttons) that make up the 
cell address. Elements which have children are displayed 
with an arrow icon to the left of the element's name. 10 
Elements that are used by other elements (parent elements) 
are displayed with an arrow icon to the right of the element's 
name. 

The right column, titled "Uses", shows the elements 
whose values can be computed from the selected data cell 35 
address element. 

When the Cell Explorer is first opened, a data cell address 
is already specified. However, the user can change the data 
cell by modifying the elements in the data cell address 
portion of the window. The address of UEVs entered in the 20 
Cell Explorer is specified in the same number of dimensions 
as the Cell Explorer itself, with all other dimensions wild- 
carded. 

To specify a data cell, the following operations may be 
performed: To use a Source or Uses element: 25 

1. Select a data cell address element by depressing the 
button. 

2. If Source or Uses elements are available, they are 
displayed in their respective lists. 

3. Double-click on a Source or Uses element. Cell 30 
Explorer replaces the currently selected data cell address 
element with the Source or Uses element. The data cell's 
value is then queried for and displayed and the new rule is 
displayed, according to the window's auto-recalc setting. 

To add address elements: 35 

1. Drag an element from the Modeler's Workbench or the 
worksheet and drop it on the data cell address portion of the 
window. The user can also paste an element from the 
clipboard. If the dimension for the new element is already 
referenced in the data cell address, the new element replaces 40 
the existing element for that dimension. If the dimension is 
not currently referenced, the new element is added to the 
address. 

2. The data cell's value is recomputed and t he ne w rule i& 
displayed, according to the window's auto-recalc setting. 45 
To delete address elements: 

1. Select an clement in the data cell address. 

2. Press the Del key or use the Edit | Delete menu 
command. 

3. The data cell's value is recomputed and the new rule is 50 
displayed, according to the window's auto-recalc setting. 
Note that since deleting an element also removes the dimen- 
sion the element belongs to, the dimensionality of the data 
cell is reduced. 

Hyperstructure Query Language (HQL) is used for que- 55 
rying the DOLAP system for data. It is a data manipulation 
language only, not a data definition language. Data definition 
is accomplished via the DOLAP Application Programming 
Interface. HQL is a "technical language", like SQL. HQL 
queries will usually be created automatically by software, by 60 
the DOLAP Client Application Worksheet or 3rd party 
report generators. Like SQL, HQL may be used in a sim- 
plistic manner by naive users, in a more sophisticated 
manner by power users, and in its full power and complexity 
only by experts. 65 

A query expressed in HQL, the Hyperstructure Query 
Language, is sent to the DOLAP Server for an open DOLAP 
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model. Queries fetch cell values from the datacube, fetch 
lists of qualifying elements (optionally with their selected 
attribute values), or fetch lists of unique attribute values. 
Model metadata is retrieved using the API, not queries. 
Query processing can take place in both the DOLAP Server 
and the database server, depending on the amount of opti- 
mization that DOLAP can achieve. Every effort is made to 
"push" computation into the database server, to take advan- 
tage of its computational power and to decrease bandwidth 
demands on the network hosting the DOLAP and database 
servers. 

Query language supports grouping operator: Q. 

The DOLAP Client interprets quotes in HQL and custom 
formulas as follows: two adjacent double quotes translates to 
one double quote as part of an element name; a double quote 
not adjacent to another double quote is treated as a grouping 
operator. 

When elements in different dimensions have the same 
name, then the user must disambiguate them using the 
syntax <dimension-name>.<element-narae> 

Numeric constants and string constants may be used in 
query specifications. 

It can support "cutoff" type element query functions: 
<element-list>WHERE "TOP" or "BOTTOM"<integer> 

Elements are disambiguated by <dimension- 
name>.<element-name>Query language includes structural 
element query function: ChildrenOf, returns elements in the 
order defined by the element hierarchy, except for TDE's. 

Query language includes structural element query func- 
tion: DescendantsOf. 

Query language includes structural element query func- 
tion: LeavesOf. 

Query language includes structural element query func- 
tion: ParentsOf (INDENT?) 

Query language includes structural element query func- 
tion: Parent Of. 

Query language includes structural element query func- 
tion: RootsOf 

Query language includes structural element query func- 
tion: AncestorsOf. 

Supported decimal separators are periods. 

Supported list separators are commas. 

A query for ChildenOf(dimension) will return the root 
elements of the dimension in the order they appear in the 
MWB. 

Element attribute comparisons are by default case 
sensitive, for search and sorts. 

Support element with (list of) attribute^) and value(s) 
query. 

The result of a metric comparison between a cell value 
and a multi-dimensional tuple is true if at least one of the 
individual comparisons is true. 

Interpretation of the following items is case-insensitive: 
logical database name, model names, worksheet names, info 
space names, BLOB names, dimension names, level names, 
element names, attribute names, HQL keywords, custom 
function names, and custom formula keywords. 

HQL supports UPPER and LOWER functions applied to 
attributes in WHERE and ORDER BY clauses; this enables 
case-insensitive attribute sorts and comparisons. 

The user can query by element name for transients by 
using the "name" attribute 

Numeric sort: nulls and errors ordered first for ascending 
sorts and last for descending sorts, sub-sorted by internal 
error numbers assigned 

String sort: nulls will be sorted first for string comparisons 
for ascending sorts and last for descending sorts. 
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HQL is used by any application (Sybase or 3rd party) via 
the DOLAP API. In the DOLAP Client Application, HQL 
may be visible in error log file entries, the I HQL Window 
and the SQL Audit window. The Hyperstructure Query 
Language, HQL, is how a request for such data is expressed. 

There is one "entry point" to the Query Language Gram- 
mar. This is the symbol query in the language grammar 
below. There are four variants of a query: 

cell-query (yields cells labeled with elements) 

element-query (yields a list of elements from a 
dimension) 

element-attribute-value-query(yields a list of elements 
and one or more attribute values for each) 

attribute-value-query (yields a list of unique attribute 
values). Most queries are cell-queries, since DOLAP is most 
often used to analyze the numbers in the multidimensional 
datacube described by a model. The DOLAP Client Appli- 
cation Worksheet generates cell-queries to retrieve the num- 
bers it displays. Element, element-attribute-value, and 
attribute-value queries will typically be used by third party 
metadata exploration tools such as the so-called "knock- 
down browser". The attribute mechanism allows searching 
the element space (the DOLAP metadata) by relational data 
available in the SQL Database only. 

The most common use of element, element-attribute- 
value, and attribute-value queries is as sub-queries embed- 
ded within a cell query, generally created and issued pro- 
grammatically through the DOLAP API by 3rd party client 
applications. 

Queries involving transient dynamic elements (TDEs) are 
specified and processed just like those involving non-TDEs, 
except that: 

1. TDEs cannot be specified by name in ElementSelects. 

2. They cannot be queried for directly using an element 
query. This can be accomplished by specifying a "name" 
attribute for the transient level of the dynamic hierarchy, 
associating it with the same column that specifies the name 
for the TDEs, and using an element-attribute-value query. 

3. Results will not be cached. 

Grammar for HQL (Extended Backus-Naur Form) 
Basic Blocks 

Literals: integers, numbers including standard scientific 
notation, string constants, which are any sequence of 
characters in single quotes, dates, and identifiers, which 
can be either a regular identifier which starts with a letter, 
or a delimited identifier which is any sequence of char- 
acters enclosed in double quotes. 

Additionally, there are the following keywords: 



ANCESTORS OF 


AND ASCENDING ATTRIBUTE 


ATTRIBUTES 






BOTTOM 


BY 




CELL CHILDREN OF 






DESCENDANTS OF 


DESCENDING 


DISTINCT 


ELEMENT NAME 


ERROR 




FROM 






INVALID REFERENCE 






LEAVES OF 


LEVEL 


LOWER 


MATH ERROR 


NOT 


NULL 


ONLY OR 


ORDER 


ORDERED 


PARENT OF 


PARENTS OF 




ROOTS OF 


RULE CONFLICT 


RULE MISSING 


SELECT 






TOP 






UPPER 


USER DEFINED ERROR 




WHERE 


WITH 





Keywords are case-insensitive 

Whitespace (any sequence of spaces, tabs, carriage 
returns, line feeds, and/or form feeds) is ignored except 
within string constants or quoted identifiers. 



10 



15 



20 



25 



30 



35 



40 



45 



50 



55 



60 



65 



Referencing Elements, Dimensions, Levels and Attributes 
A dimension is just an identifier: 
Dimension: 
Identifier 

An element reference is either an element name alone, or 
an element name together with the dimension name, in the 
form <dimension-name>.<element-name>. This form must 
be used when the same element name is used in multiple 
dimensions, where the parser cannot otherwise determine 
the dimension of an element. 
Element 

ElementName 

Dimension. ElementName 
ElementName 

Identifier 

A level reference is either a name, or a name qualified with 
a dimension reference: 
LevelRef. 
LevelName 

Dimension . LevelName 
LevelName 

Identifier 
An attribute name is just a name 
AttributeName 

Identifier 
Query Types 

HQL allows four different query types 
Query: 

Cell Query 

Element Query 

ElementPlusAttributesQuery 

AttributeQuery 
Cell Queries 

Cell Query returns the cells specified by a cells list. A 
single query can consist of multiple cell specifications. Each 
cell query specification defines a tuple domain. 
CellQuery 

SELECT 1\ipleDomain 
TupleDomain 

Tuple 

( r l\iplelist) 
TupleList 
'Aiple 

T\iplcList, Tuple 
Note that the list of tuples is enclosed in parentheses, that is, 
parentheses are optional when there is one tuple, but are 
required when there is more than one. Tuple: 
A tuple is a | — separated list of element selections, 
enclosed in square brackets: In this version of the preferred 
embodiment, each element selection within a tuple must be 
from a different dimension than any of the others. 
Tuple 

[DimensionSelectList] 
DimensionSelectlist 

ElementsSelectList 

DimensionSelectList | ElementsSelectList 
Element Queries 
An element query returns a list of elements. 
ElementQuery 

SELECT ElementsSelectList 
ElementsSelectList 

ElementsSclect 

ElementsSelectList, ElementsSclect 
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Elements Selection 

An elements selection is defined by the elements bound- 
ing list specification, an optional filtering criteria, and an 
optional ordering criteria. Tlie elements must all be from the 
same dimension. The parser must be able to determine the 5 
dimension of the first element reference unambiguously, or 
an error will result. Thereafter, all element references are 
assumed to be from that dimension. So, in a model with both 
BillTo and ShipTo dimensions, in the expression 
ShipTo.Boston, Chicago, New York, it's assumed that Chi- 10 
cago and New York refer to elements in the ShipTo dimen- 
sion. Should either of them not exist in that dimension, an 
error would result. 
ElementsSelecl 

ElementBoundingList Element Where(opt) 15 
ElementOrder(opt) If no element survives the Element- 
Where filtering specification, an empty list is returned. 
Elements Bounding List 

The elements bounding list specifies the set of elements 
from which a smaller subset is going to be filtered. It can be 20 
an element name, a function, an element selection in 
parentheses, or a level specification. 

ElementBoundingList 
Element 

ElementFunction 25 
Dimension 
( ElementsSelect) 
LEVEL LevelRef 
Element functions are as follows 
ElementFunction: 30 
ElementFuncChildren 
ElementFuncParentsn 
ElementFuncDescendants 
ElementFuncAncestorsn 

ElementFuncLeavesn 35 
ElementFuncRootsn 

[ElementFuncRelevantChildren-removed for Release 
1] 

ElementFuncChildren: 4Q 

ChildrenOf (LevelNumber(opt) 
ElementListOrDimension) 
ElementFuncParents: 

ParentsOf (LevelNumber(opt) ElemcntsSelectList) 
ElementFuncDescendants: 45 

DescendantsOf (LevelNumber(opt) 
ElementListOrDimension) 
ElementFunc Ancestors: 

AncestorsOf (LevelNumber(opt) ElementsSelectList) 
ElementFuncLeaves: 50 

LeavesOf ( ElementListOrDimension) 
ElcmcntFuncRoots: 

RootsOf ( ElcmentSclcctList) 
[ElementFuncRelevanlChildren 55 

RelevantChildrenOf (ElementListOrDimension, 
Tuple) 
LevelNumber: 

integer, 

ElementListOrDimension is a non-terminal symbol stand- 60 
ing for either a list of elements or a dimension. It will require 
semantic analysis (there can be either a list of elements or a 
single dimension name, and these cases cannot be distin- 
guished syntactically, and a single element can be either 
dimension name or an element name): 65 
ElementListOrDimension 

Dimension 
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ElementsSelectList 

In this version of the preferred embodiment, all the 
elements must belong to the same dimension. 

In this version of the preferred embodiment, a dimension 
name cannot be combined with an element list. 
Element Selection Criteria 

An element selection criteria is either a Boolean expres- 
sion or a ranking function. 

Element Where: 

WHERE ElementWhereExpr 

WHERE ElementWhereRank 
Ranking function 

Analogous to the corresponding SQL function. 

ElementWhereRank 

TOP integer BY Tuple 

BOTTOM integer BY Tuple 
Semantics 

Each element from the bounding list is concatenated with 
Tuple to form a cell address. The array of the values obtained 
this way is sorted, and the elements that correspond to the 
<integer>top/bottom values are returned. Null and error 
elements are never considered for selection. 

Note that the number of returned elements can be more 
than <integer> (if there are multiple elements in a list with 
the same value for the examined cell), the number of 
returned elements may be less than <integer> (there are not 
enough elements in a list) and duplicate elements are carried 
over to the returned list 

In this preferred embodiment, the tuple cannot contain 
any element in the queried dimension. 

In this preferred embodiment, the tuple has to contain 
exactly one element in each of the dimensions specified in 
it. 

Element Selection By Expression 

This is a logical expression, with elementary factors of the 
following kinds: 

cell value comparisons and range 

attribute value comparisons 

element name comparisons 
In this preferred embodiment, it is not possible to compare 

a cell value with an attribute value. 
ElementwhereExpr 

ElementWhereTerm 

ElementWhereExpr OR ElementWhereTerm 
Element WhereTerm 
ElcmcntWhercFactor 

Element WhereTerm AND Element WhereFactor 

Element WhereFactor 
NOT(Opt) ElementWhereTest 

Element WhereTest 
Elemc ntWhcre Primary 
(Element hereExpr) 

ElemenlWherePrimary 
Element Where ValueComp 
ATTRIBUTE ElementWhereAttributeComp 

Note that since an ElementWhereExpr can be embedded in 
another ElementWhereExpr, and since both element que- 
ries and cell queries are building blocks for a 
ElementWhereExpr, it is therefore possible to embed a 
cell query inside an element query, or an element query 
inside another element query. 

Semantics: When a tuple is being compared to a number, 
the cells whose values will be compared to the number arc 
identified by concatenating the tuple with each clement in 
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the bounding list. If two tuples are being compared, the 
actual tuples to be compared are constructed by concatenat- 
ing each tuple with each element from the bounding list. At 
least one of the resulting tuples must then evaluate to a 
single cell for each element from the bounding list. 
Element Where ValueComp 

Tuple OpCompare Number 

Number OpCompare Tuple 

Tuple OpCompare Tuple 

Tuple IS NOT(Opt) NULL 

Tuple IS NOT(Opt) ERROR 

Tuple IS NOT(Opt) RULEMISSING 

TVple IS NOT(Opt) RULECONFLICT 

ttiple IS NOT(Opt) MATHERROR 

Tuple IS NOT(Opt) INVAUDREFERENCE 

T\iple IS NOT(Opt) USERDEFI NED ERROR 
OpCompare 

< 

<= 
> 

>= 
<> 

In this version of the preferred embodiment, the tuple 
reference in a comparison expression cannot contain any- 
thing in the bounding list dimension. 
Note that if we need to choose the elements for which all the 
vector components satisfy the condition, this can be 
achieved by prepending NOT to the negated condition: 
SELECT ChildrenOf(All Customers) WHERE 

[Sales | "1994" | ChildrcnOf(All Products) >5000 

returns the list of the customers who bought more than 
$5000 worth of any product in 1994, whereas 
SELECT ChildrenOf(AlI Customers) WHERE 

NOT tfSales fl^'IChildreiiOftAll Products)]<-5000) 

returns the list of the customers who bought more than 
$5000 worth of every product. 
Attribute Value Comparisons 
ElementWhereAttributeComp 
AttributeRef OpCompare AttributeLiteral 
AttributeLiteral Op Compare AttributeRef 
[AttributeRef LIKE RcgularExpression — removed for 
Rl] 

[AttributeRef IN (AUributcLitcralList)-— removedfor 
Rl] 
AttributeRef 
LevelName.AttributeName 
UPPER (LevelNamcAttributeName) 
LOWER (LevelName.AttributeName) 
UPPER and LOWER are used in WHERE and ORDER BY 
clauses to enable case-insensitive attribute sorts and com- 
parisons. Otherwise sorts and comparisons arc case- 
sensitive. 

AttributeLiteral 

Number 

String 

Date 

NULL 

Two attributes cannot be compared to each other. 

In this version of the preferred embodiment, only one 
level may be specified in a where clause. All elements that 
arc not part of that level arc rejected. 
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Element Ordering 
Elements can be ordered in the following ways: 
by the cell values and/or attribute values 
5 by element name 

according to their order in the Elements Bounding List 

ElementOrder 

ORDER BY ElemcntOrderByNameCellOrAttribute 
OrderDirection(opt) 
10 ElementOrderByNameCcllOrAttribute 
ELEMENTNAME 
Tuple 

ATTRIBUTE AttributeRef 
15 OrderDirection 

ASCENDING 

DESCENDING 
The elements can be ordered either in ascending or descend- 
ing order. The default order is ASCENDING. If there is no 
20 sorting specification, then the order is determined by the 
order of the elements in the bounding list. Semantics: The 
specified tuple must have a single element in each specified 
dimension. The bounding dimension can not be specified. 
Element-Attribute Queries 
25 An element attribute query returns a list of elements and 
one or more attribute(s) for each. 
ElementPlusAttributeQuery 

SELECT ElementsSelect WITH ATTRIBUTES 
AttributeNameList AttrOrder(opt) 
30 AttributeNameList 

LevelNamcAttributeName 

AttributeNameList, LevelName.AttributeName 
AttrOrder 

ORDER BY AltributeRejList OrderDirection(opt) 
35 AttributeRefList 
AttributeRef 

AttributeRefList, AttributeRef Elements from the element 
select list which do not have the specified attributes will be 
filtered out of the returned list (as opposed to return a NULL 
40 value for the attribute^)). 
Attribute Queries 

An attribute query returns a list of unique attribute values. 
AttributeQuery 

SELECT DISTINCT ATTRIBUTE Attribute FROM 
45 ElementsSelect AttrOrdered(opt) Note that elements in the 
ElementsSelect list which do not have the specified attribute 
will be filtered out of the results. 
Attribute: 

LevelRef. AttributeName 
50 AttrOrdered 

ATTRIBUTE ORDER OrderDirection 

In this version of the preferred embodiment, the Attribu- 
teRef can only refer to the Attribute that is being selected. 

In this version of the preferred embodiment, the LevelRef 
55 in AttributeQueries must contain the dimension. 

Other features of the DO LAP system include an Interac- 
tive Hyperstructure Query Language facility, which enables 
the user to submit queries against a model and view the 
results without using a worksheet. It is an effective way of 
60 issuing queriesby entering the textual form of the query 
instead of dragging and dropping elements, as would be 
done building worksheets. 

The Distributed OLAP system 10 thus provides the kind 
of computer on-line analytical processing and modeling 
65 system that can utilize large relational data bases and at the 
same time offer large-scale modeling. System 10 requires 
less administrative attention to maintain, supports many 
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interactive users and allows easy insertion of hypothetical 
values to produce "what-if ' analysis. Control is distributed 
to the end-users, rather than being centrally controlled by an 
IT department. This end-user control has been accomplished 
by providing a system in which models can be easily and 
rapidly constructed by end users who need not be computer 
programmers. System 10 provides an HQL language that 
can be used to query these models which have been con- 
structed in accordance with rule sets establishing the 
domains in which rules apply. Hierarchies of priorities are 
maintained among user-entered values and formulas, data- 
maps and autoformulas. System 10 also provides user inter- 
face tools such as the Cell Explorer, which allow users to 
audit the means by which the value of each cell is calculated. 
This allows users to understand the workings of the models 
more fully, and to participate in the building and querying of 
models in a much more informed manner. All of these 
components contribute to making control of data modeling 
truly "distributed" among the users. 

In addition to the above mentioned examples, various 
other modifications and alterations of the inventive system 
10 may be made without departing from the invention. 
Accordingly, the above disclosure is not to be considered as 
limiting and the appended claims are to be interpreted as 
encompassing the true spirit and the entire scope of the 
invention. 

Industrial Applicability 

The present Distributed On-Line Analytical Processing 
system 10 is well suited for computer modeling of processes 
in any number of business applications. The ability to plan 
for contingencies can be crucial to the survival of a business 
in an increasingly complex world. As competition from 
foreign competitors grows, the number of variables affecting 
a company's performance has grown greatly. The ability to 
make predictions based on complex combinations of vari- 
ables can have tremendous applications in many industries. 
It is this handling of complex variables that the present 
invention 10 is designed to facilitate by forming multidi- 
mensional computer models 50. 

The present invention 10 calculates cell values "on the 
fly" in response to queries, and thus does not require the 
massive loading and pre-calculation of the entire data cube 
with accompanying large storage requirements, as in 
MOLAR The present invention 10 is easy to administer, can 
support models 50 with very large dimensions 54, and can 
also support 15-20 or more dimensions 54 with almost 
unlimited subcubes 66. The present invention 10 allows for 
distributed control of business analysis processes while 
allowing full configurability of access rights, and supports 
advanced interactive analysis of large data warehouses by 
large user groups. Due to a unique modeling approach called 
"Domain Modeling™", users with no background in com- 
puter programming can design the methods by which cell 
contents arc formulated. Users may also enter hypothetical 
values for easy contingency modeling and "what-if' analy- 
sis. 

For just one example among very many, as seen in FIG. 
3 previously, a holding company called Amalgamated hav- 
ing subsidiaries Acme, Bingo and Coronet could use the 
present invention 10 to create a datacube model 52 com- 
posed of cells 56 having the dimensions of Time 60, 
Organization 62 and Finance 64. The Time dimension 60 
could contain elements 1996, Ql-1996, Q2-1996, Q3-1996 
and Q4-1996; the Organization dimension 62 could contain 
elements Amalgamated, Acme, Bingo and Coronet; and the 
Finance dimension 64 could contain elements Profit, Rev- 
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enue and Expenses. As part of the model building process, 
rules for manipulating data are established. In this example, 
the combined Revenue of the three subsidiaries are added 
for any one time period to give the Revenue for Amalgam- 

5 ated during that same time period. Similarly, Revenue for the 
element 1996 in the Time dimension is the sum of the 
Revenues for the four quarters Q1-Q4 for any particular 
Organization. The data for these calculations can be 
retrieved from databases 16 by the use of data maps. This 

10 data can further be processed by the use of formula rules or 
rule domains 60 associated with particular cells 56 or groups 
of cells. As a further example, another element 58 in the 
Time dimension 62 could be "1997", in which Revenue is 
projected to be 125% of the total 1996 value of Revenue. 

15 Therefore, a formula is established which will take the 1996 
Revenue value and multiply it by 1.25. 

By allowing entry of User Entered Values 96, the values 
otherwise calculated can be overridden to allow hypothetical 
scenarios to be viewed. In this example, different scenarios 

20 could be set up to test the effects of, for example, rising tax 
rates or declining prices of materials, which could be 
reflected in the Expenses element of the Financial dimension 
65. Similarly, sales projections for one or more subsidiary 
companies could be entered in the Revenue element of the 

25 Financial dimension 65 to see how the performance of the 
Amalgamated Company as a whole would be affected. In 
this same manner, combinations of variables can be entered 
by use of User Entered Values 96 to make projections of 
company performance which would be difficult or perhaps 

30 impossible by other methods. The number of dimensions 54 
available for model creation and use are not limited to the 
three represented by a cube 52 in the above example. It is 
possible to create very complex models of many more 
dimensions, thus allowing decisions to be made on the basis 

35 of a great number of variables. 

The present invention 10 is useable by almost any busi- 
ness which deals with a number of changing variables, and 
in which the ability to make informed decisions from 

4Q projections based on data is important. It is perhaps difficult 
to think of businesses in which these qualities do not apply. 
In particular, banking and investing firms can plan for the 
influence of changing interest rates or tax rates or even 
passage of new regulatory legislation. Manufacturers can 

45 use the present invention 10 to plan for the effect of business 
expansion by themselves or competitors, or introduction of 
new product lines. Merchants can use the present invention 
10 to plan sales orders. 

Prior art computer models required considerable time and 

50 expertise to establish and maintain them. The ease of use of 
the present invention opens up many applications to com- 
panies which may not have large Information Transfer 
departments. The fact that models can be developed by 
non-programmers allows many users (whose expertise may 

55 lie in other fields) to have access to powerful modeling 
techniques. Since the formulas used in calculating a particu- 
lar cell value can be easily viewed, the basis of any projec- 
tion can be discussed and analyzed with greater efficiency 
and understanding. Most importantly, new ideas can be 

60 attempted first in the modeling stage, rather than in the real 
world where experimentation can be costly and risky. Thus, 
by using the present invention 10 innovation is encouraged. 

For the above, and other, reasons, it is expected that the 
DOLAP system 10 of the present invention will have 

65 widespread industrial applicability. Therefore, it is expected 
that the commercial utility of the present invention will be 
extensive and long lasting. 
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What is claimed is: 

1. A method of modeling a plurality of variables in a 
hypcrstructurc in a computer modeling system having at 
least one data storage means, a data processing means, input 
means, and output means, the steps comprising: 

(A) obtaining measurements of physical objects and 
activities which are related to the entity to be modeled 
in the computer hyperstructure; 

(B) transforming said measurements into computer data 
which corresponds to the physical objects and activities 
external to the computer system; 

(C) constructing a plurality of independent dimensions 
from said computer data within a hyperstructure, where 
each dimension has at least one element; 

(D) creating a plurality of cells, each of which is associ- 
ated with the intersection of at least two elements, each 
cell being capable of storing at least one value; 

(E) associating at least one rule domain with at least one 
cell, said rule domain including at least one means for 
assigning values to the associated cells; and 

(F) preparing a domain modeling rule set which deter- 
mines which of said assigning means will provide the 
value associated with each of said cells, wherein the 
application of said domain modeling rule set to the 
hyperstructure causes a physical transformation of the 
data corresponding to said physical objects which are 
modeled in said hyperstructure. 

2. The method of computer modeling of claim 1 wherein: 
the means of assigning values in step (E) is a rule selected 

from the group consisting of User Entered Formulas, 
autoformulas, User Entered Values, and datamaps. 

3. The method of computer modeling of claim 1, wherein 
step (F) further comprises: 

creating a Do Not Aggregate list, a Datamap Rules List, 
an Autoformula List and a Rules List. 

4. The method of computer modeling of claim 3 wherein 
step (F) further comprises: 

(a) gathering the rules; 

(b) processing the autoformulas; 

(c) processing the datamaps; 

(d) ordering the rules; 

(e) promoting the rules; and 

(f) processing rule conflicts. 

5. The computer modeling method of claim 4, wherein 
step (a) includes: 

(1) building a Do Not Aggregate List; 

(2) getting the next rule; 

(3) if the rule is a datamap, then saving the datamap in the 
Datamap Rules List; 

(4) if the rule is an autoformula, then adding the rule to the 50 
Autoformula List; 

(5) if the rule is an aggregation, then creating a time 
summary property rule and adding the time summary 
property rule to the Rules List; 

(6) else adding the rule to the Rules List; and 

(7) repeating steps (2) through (6) until the last rule has 
been processed. 

6. The computer modeling method of claim 5, wherein 
step (5) includes: 

(a) getting the next time summary property (TSP) from 
the model; 

(b) creating a new rule for that TSP; 

(c) adding this new rule to the Rules List; and 

(d) repeating steps (a) through (c) until all TSPs have been 
processed, and returning. 

7. The computer modeling method of claim 6, wherein 
step (c) includes: 
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(i) if the rule is not a list formula, then appending the rule 
to the Rules List, and returning; 

(ii) if the rule is a list formula, then fetching the first next 
element in the rule domain associated therewith; 

(iii) if said first next element does not have a "Do not 
aggregate" property, then fetching the second next 
element; 

(iv) if said first next element has a "Do not aggregate" 
property, and said first next element is in this rule's 
dimension, then fetching the second next element, else 
removing said first next element from the formula's 
domain; and 

(v) repeating steps (ii) through (iv) until all elements have 
been processed, then appending the rule to the Rules 
List and returning. 

8. The computer modeling method of claim 4, wherein 
step (b) includes: 

(1) getting the next rule in the Autoformula List; 

(2) if the rule is not in the Time dimension, then adding 
the rule to the Rules List; 

(3) else creating a time summary property rule and adding 
the time summary property rule to the Rules List; and 

(4) repeating steps (1H3) un^l the last rule has been 
processed. 

9. The computer modeling method of claim 4, wherein 
step (c) includes: 

(1) gathering hierarchies and levels from all dimensions; 

(2) getting the next rule from the Datamap Rules List; 

(3) setting all elements and dimensions as valid; 

(4) getting the next hierarchy; 

(5) getting the next level; 

(6) if this level is joinable to the previous level, then 

(a) if the level is first in its dimension, then turning off 
all elements in this dimension and adding this level's 
elements to the rule's domain; 

(b) if the level is not first in its dimension, then adding 
the level's elements to the rule's domain; 

(c) in either case (a) or (b), then 

(i) if datamap is a low level datamap, then getting the 
next hierarchy; 

(ii) if datamap is not a low level datamap, then 
getting the next level; 

(7) if this level is not joinable to the previous level, then 
getting the next level; 

(8) repeating steps (5), (6) and (7) until all levels in a 
hierarchy are processed; 

(9) repeating steps (4)-(8) until all hierarchies in a rule are 
processed, then turning off all elements not in the 
table's rule domain and adding the rule to the Rules 
List; and 

(10) repeating steps (2)-{9) until all rules are processed. 

10. The computer modeling method of claim 4, wherein 
step (d) includes: 

(1) getting the next rule from the Rules List; 

(2) assigning a primary sort value depending on rule type; 

(3) assigning a secondary sort value depending on rule 
type; 

(4) repeating steps (1M 3 ) each rule has been 
assigned primary and secondary sort values; and 

(5) ordering the Rules List by primary and secondary sort 
values. 

11. The computer modeling method of claim 4, wherein 
step (e) includes: 

(1) establishing a Trigger List; 

(2) getting the next rule from the Rules list; 
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(3) if the rule is a user entered value, or a user entered 
formula, then adding the rule to the Trigger List; 

(4) else getting the next rule from the Rules List; 

(5) repeating steps (2)-(4) until the last rule has been 
processed; 

(6) getting the next rule from the Trigger List; 

(7) getting the next autoformula from the Autoformula 
List; 

(8) computing the affected elements from the autoformu- 
la's dimension which are superceded by the Trigger 
List to make a modified autoformula list; 

(9) repeating steps (6) and (7) until all autoformulas are 
processed; 

(10) getting the next autoformula from the modified 
autoformula list; 

(11) creating a promoted autoformula rule; 

(12) repeating steps (10) and (11) until all autoformulas 
are processed; and 

(13) repeating steps (6)-(12) until all rules have been 
processed. 

12. The computer modeling method of claim 4, wherein 
step (£) includes: 

(1) getting the next rule from the Rules List; 

(2) creating a check set consisting of all user entered 
values with equal specificity; 

(3) creating a check set consisting of all high priority rules 
with equivalent specificity; 

(4) creating a check set consisting of all non-commutative 
rules with low priority with equivalent specificity; 

(5) creating a check set consisting of all remaining 
commutative, low priority rules with equivalent speci- 
ficity from the same dimension; 

(6) if any rule's domain intersects any rules in the check 
sets, then inserting a conflict rule with a rule domain 
equal to the intersection region ahead of the rule in the 
list, and saving references to the conflict rule; 

(7) if there is no intersection, then getting the next rule 
from the Rules List; and 

(8) repeating steps (l)-( 7 ) unt *l a11 niles have been 
processed. 

13. The method of computer modeling of claim 1, the 
steps further comprising: 

(G) providing a computer data manipulation language 
suitable for expressing queries to obtain data from a 
hyperstructure. 

14. The method of computer modeling of claim 1, the 
steps further: 

(H) providing a user interface means by which the method 
of calculation of each cell value can be observed. 

15. A method of querying a multidimensional computer 
modeling data structure in a computer system having at least 
one data storage means, data processing means, input 
means, output means, a hyperstructure constructed in one of 
said data storage means, a calculation engine, a domain 
modeling rule set preparation module, a query engine, and 
an evaluator which communicates with an SQL generator, a 
math library, a cache retrieval device and a sort and search 
processor, comprising the steps of: 

(A) inputting a query; 

(B) parsing the query; 

(C) creating a query component tree; 

(D) inputting model metadata into said domain modeling 
rule set preparation module; 
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(E) generating a domain modeling rule set from said 
domain modeling rule set preparation module; 

(F) inputting said domain modeling rule set into said 
calculation engine; 

(G) generating an execution tree with rules from said 
calculation engine; 

(H) inputting said execution tree with rules to said query 
engine; 

(I) generating an optimized execution tree from said query 
engine; 

(J) inputting said optimized execution tree to said evalu- 
ator; 

(K) communicating between said evaluator and a data 

reference means; 
(L) generating an execution tree with results; 
(M) repeating steps (C)-(L) if there had been any inner 

queries presented before the query as a whole can be 

addressed; 
(N) packaging the query results; and 
(O) oulputting query results. 

16. The method of querying of claim 15, wherein: 
said data reference means is selected from the group 

consisting of an SQL generator which communicates 
with a Relational Database Management System, a 
math library, a sort and search processor, and a cache 
retrieval means which communicates with a multi- 
dimension cache. 

17. The method of querying of claim 15, wherein step (E) 
further comprises the steps of: 

(a) gathering the rules; 

(b) processing the autoformulas; 

(c) processing the datamaps; 

(d) ordering the rules; 

(e) promoting the rules; and 

(f) processing rule conflicts. 

18. A computer system for modeling a plurality of vari- 
ables in a hyperstructure comprising: 

at least one data storage means; 
data processing means; 
input means; 
output means; 

measurements of physical objects and activities which are 
related to the entity to be modeled in the computer 
hyperstructure which arc transformed into computer 
data which corresponds to the physical objects and 
activities external to the computer system; 

a plurality of independent dimensions within the hyper- 
structure constructed from said data, where each 
dimension has at least one element; 

a plurality of cells, each of which is associated with the 
intersection of at least two elements, each cell being 
capable of storing at least one value; 

at least one rule domain associated with at least one cell, 
each said rule domain including at least one means for 
assigning values to the associated cell; and 

a domain modeling rule set for determining which of said 
assigning means will provide the value associated with 
each of said cells, wherein the application of said 
domain modeling rule set to the hyperstructure causes 
a physical transformation of the data corresponding to 
said physical objects which are modeled in said hyper- 
structure. 
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